Algebra is generous; she often gives more than we ask of her. (d'Alembert)



Abstract. Sturm's famous theorem (1829/35) provides an elegant algorithm to count 
and locate the real roots of any given real polynomial. In his residue calculus of complex 
functions, Cauchy (1831/37) assimilated this to an algebraic method to count and locate 
the complex roots of any given complex polynomial. We give a real-algebraic proof of 
Cauchy's theorem starting from the mere axioms of a real closed field, without appeal to 
analysis. This allows us to algebraicize Gauss' geometric argument (1799) and thus to 
derive a real-algebraic proof of the Fundamental Theorem of Algebra, stating that every 
complex polynomial of degree n has precisely n complex roots. The proof is elementary 
inasmuch as it uses only the intermediate value theorem and arithmetic of real polynomials. 
It can thus be formulated in the first-order language of real closed fields. Moreover, the 
proof is constructive and immediately translates to an algebraic root finding algorithm. 
The latter is sufficiently efficient for moderately sized polynomials, but in its present form 
it still lags behind Schönhage's nearly optimal numerical algorithm (1982).



1. Introduction and statement of results 

1.1. Historical origins. Sturm's theorem [53, 54], announced in 1829 and published in 
1835, provides an ingeniously simple algorithm to determine for each real polynomial 
$P \in \mathbb{R}[X]$ the number of real roots in any given interval $[a,b] \subset \mathbb{R}$. Sturm's result solved
an outstanding problem of his time and earned him instant fame. In 1831/37 Cauchy [8, 9]
extended this algebraic method to determine for each complex polynomial $F \in \mathbb{C}[Z]$ the
number of complex roots in any given rectangle $[a,b] \times [c,d] \subset \mathbb{C}$.

Unifying the real and the complex case, we give a real-algebraic proof of Cauchy's 
theorem starting from the mere axioms of a real closed field, without appeal to analysis. 
From this we deduce an elementary, real-algebraic proof of the Fundamental Theorem of 
Algebra, stating that every polynomial $F \in \mathbb{C}[Z]$ of degree $n$ has precisely $n$ roots in $\mathbb{C}$.
This classical theorem is of theoretical and practical importance, and our proof attempts to 
satisfy both aspects. Put more ambitiously, we strive for an optimal proof, which is at the 
same time elementary, elegant, and effective. 

The logical structure of such a proof was already outlined by Sturm in 1836, but his
article [55] lacks the elegance and perfection of his famous 1835 mémoire. This may explain
why his sketch found little resonance, was not further worked out, and became forgotten
by the end of the 19th century. The contribution of the present article is to save the
real-algebraic proof from oblivion and to develop Sturm's idea in due rigour. The presentation
is intended for non-experts and thus contains much introductory and expository material.

1.2. The theorem and its proofs. Before going into details let me outline the context.
The Fundamental Theorem of Algebra states that every complex polynomial of degree $n$
has precisely $n$ complex roots, counted with multiplicity. More explicitly, we consider the
field $\mathbb{R}$ of real numbers and its extension $\mathbb{C} = \mathbb{R}[i]$ where $i^2 = -1$.

Theorem 1.1 (Fundamental Theorem of Algebra). For every polynomial

$$F = Z^n + c_{n-1} Z^{n-1} + \dots + c_1 Z + c_0$$

with complex coefficients $c_0, c_1, \dots, c_{n-1} \in \mathbb{C}$ there exist $z_1, z_2, \dots, z_n \in \mathbb{C}$ such that

$$F = (Z - z_1)(Z - z_2) \cdots (Z - z_n).$$

Numerous proofs of this theorem have been published over the last two centuries.
According to the tools used, they can be grouped into three major families (§8):

(1) Analysis, using compactness, analytic functions, integration, etc.;

(2) Algebra, using symmetric functions and the intermediate value theorem;

(3) Algebraic topology, using some form of winding number.

The real-algebraic proof presented here is situated between (2) and (3) and combines
Gauss' winding number with Sturm's theorem. It enjoys several remarkable features:

• It uses only the intermediate value theorem and arithmetic of real polynomials.

• It is elementary, in the colloquial as well as the formal sense of first-order logic.

• All arguments and constructions extend verbatim to all real closed fields.

• The proof is constructive and immediately translates to a root finding algorithm.

• The algorithm is easy to implement and reasonably efficient in medium degree.

• It can be formalized to a computer-verifiable proof (theorem and algorithm).

Each of the existing proofs has its special merits. I do not claim the real-algebraic proof
to be the shortest, nor the most beautiful, nor the most profound one, but it is arguably
short, beautiful, and profound. Overall it offers an excellent cost-benefit ratio.

1.3. Algebraic index theory. Our arguments work over every ordered field $(\mathbf{R}, +, \cdot, <)$
that satisfies the intermediate value property for polynomials, i.e., a real closed field (§2).
We choose this starting point as the axiomatic foundation of Sturm's theorem (§3).

We then deduce that the field $\mathbf{C} = \mathbf{R}[i]$ with $i^2 = -1$ is algebraically closed, and
moreover establish an algorithm to locate the roots of any given polynomial $F \in \mathbf{C}[Z]$. The key
ingredient is the construction of an algebraic index (§4-§5), extending the ideas of Cauchy
[8, 9] and Sturm [54, 55] in the setting of real algebra:

Theorem 1.2. Consider an ordered field $\mathbf{R}$ and its extension $\mathbf{C} = \mathbf{R}[i]$. Let $\Omega$ be the set of
piecewise polynomial loops $\gamma\colon [0,1] \to \mathbf{C}^*$, $\gamma(0) = \gamma(1)$. If $\mathbf{R}$ is real closed, then we can
construct a map $\operatorname{ind}\colon \Omega \to \mathbb{Z}$, called algebraic index, satisfying the following properties:

(I0) Computation: $\operatorname{ind}(\gamma)$ is defined as the Cauchy index of $\frac{\operatorname{re}\gamma}{\operatorname{im}\gamma}$, recalled below, and
can thus be calculated by Sturm's algorithm via iterated euclidean division.

(I1) Normalization: if $\gamma$ parametrizes the boundary of a rectangle $\Gamma \subset \mathbf{C}$, positively
oriented as in Figure 1, then

$$\operatorname{ind}(\gamma) = \begin{cases} 1 & \text{if } 0 \in \operatorname{Int}\Gamma, \\ 0 & \text{if } 0 \in \mathbf{C} \setminus \Gamma. \end{cases}$$

(I2) Multiplicativity: for all $\gamma_1, \gamma_2 \in \Omega$ we have

$$\operatorname{ind}(\gamma_1 \cdot \gamma_2) = \operatorname{ind}(\gamma_1) + \operatorname{ind}(\gamma_2).$$

(I3) Homotopy invariance: for all $\gamma_0, \gamma_1 \in \Omega$ we have

$$\operatorname{ind}(\gamma_0) = \operatorname{ind}(\gamma_1) \quad \text{whenever } \gamma_0 \sim \gamma_1,$$

that is, whenever $\gamma_0$ and $\gamma_1$ are (piecewise polynomially) homotopic in $\mathbf{C}^*$.

The geometric idea is very intuitive: $\operatorname{ind}(\gamma)$ counts the number of turns that the loop
$\gamma$ performs around 0 (see Figure 1). The difficulty is to prove the existence of an index
having all the desired properties. Theorem 1.2 turns the geometric idea into an algebraic
construction and provides an effective calculation via Sturm chains.

Remark 1.3. The construction of the algebraic index is slightly more general than stated
in Theorem 1.2. The algebraic definition (I0) of $\operatorname{ind}(\gamma)$ also applies to loops $\gamma$ that pass
through 0. Normalization (I1) extends to $\operatorname{ind}(\gamma) = \frac{1}{2}$ if 0 is in an edge of $\Gamma$, and $\operatorname{ind}(\gamma) = \frac{1}{4}$
if 0 is one of the vertices of $\Gamma$. Multiplicativity (I2) continues to hold provided that 0 is not
a vertex of $\gamma_1$ or $\gamma_2$. Homotopy invariance (I3) applies only if $\gamma$ does not pass through 0.

Remark 1.4. The existence of the real-algebraic index over $\mathbf{R}$ relies on the intermediate
value theorem for polynomials. (Such an index does not exist over $\mathbb{Q}$, for example.)
Conversely, its existence implies that $\mathbf{C} = \mathbf{R}[i]$ is algebraically closed and hence $\mathbf{R}$ is real closed
(see Remark 2.5). More precisely, given any ordered field $\mathbf{K}$, Theorem 1.2 holds for the
real closure $\mathbf{R} = \mathbf{K}^c$ (see Theorem 2.4): properties (I0), (I1), (I2) can be restricted to loops
over $\mathbf{K}$, and it is the homotopy invariance (I3) that is equivalent to $\mathbf{K}$ being real closed.

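Remark 1.5. Over the real numbers $\mathbb{R}$, several alternative constructions are possible:

(1) Covering theory, applied to $\exp\colon \mathbb{C} \to \mathbb{C}^*$ with covering group $\mathbb{Z}$.

(2) Fundamental group, $\operatorname{ind}\colon \pi_1(\mathbb{C}^*, 1) \to \mathbb{Z}$ via Seifert-van Kampen.

(3) Homology, $\operatorname{ind}\colon H_1(\mathbb{C}^*) \to \mathbb{Z}$ via Eilenberg-Steenrod axioms.

(4) Complex analysis, analytic index $\operatorname{ind}(\gamma) = \frac{1}{2\pi i} \int_\gamma \frac{dz}{z}$ via integration.

(5) Real algebra, algebraic index $\operatorname{ind}\colon \Omega \to \mathbb{Z}$ via Sturm chains.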

Each of the first four approaches uses some characteristic property of the real numbers 
(such as local compactness, metric completeness, or connectedness). As a consequence, 
these topological or analytical constructions do not extend to real closed fields. 

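Remark 1.6. Over $\mathbb{C}$ the algebraic index coincides with the analytic index of piecewise
continuously differentiable loops $\gamma\colon [0,1] \to \mathbb{C}^*$ given by Cauchy's integral formula

(1.1)   $\operatorname{ind}(\gamma) = \frac{1}{2\pi i} \int_\gamma \frac{dz}{z}.$

This is called the argument principle and is intimately related to the covering map
$\exp\colon \mathbb{C} \to \mathbb{C}^*$ and the fundamental group $\pi_1(\mathbb{C}^*, 1) \cong \mathbb{Z}$. Cauchy's integral (1.1) is the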
ubiquitous technique of complex analysis and one of the most popular tools for proving 
the Fundamental Theorem of Algebra. 

In this article we develop an independent, purely algebraic proof avoiding integrals, 
transcendental functions, and covering spaces. Seen from an elevated viewpoint, our ap- 
proach interweaves real-algebraic geometry and effective algebraic topology. In this gen- 
eral setting Theorem 1.2 and its real-algebraic proof seem to be new. 

Remark 1.7. The index can be generalized to any finite dimension, traditionally called
Kronecker index or Brouwer degree. Apart from the fundamental group, which works only
in dimension $n = 2$, all of the above constructions have higher-dimensional counterparts:

(1) Simplicial homology, $\deg\colon H_{n-1}(\mathbb{R}^n \setminus \{0\}) \to \mathbb{Z}$.

(2) De Rham cohomology, $\deg\colon H^{n-1}(\mathbb{R}^n \setminus \{0\}) \to \mathbb{Z}$.

(3) Differential topology, mapping degree via Sard's theorem. 

(4) Simplicial approximation, mapping degree via Sperner's lemma. 

(5) It is possible to likewise extend the real-algebraic approach. 

Since the construction of higher-dimensional indices presents its own specific subtleties, 
we will postpone it to a forthcoming article [15]. It can be hoped that the real-algebraic 
index will prove as useful for real algebra as its topological counterpart is for topology. 



1.4. The Fundamental Theorem of Algebra. I have highlighted Theorem 1.2 in order
to summarize most clearly the real-algebraic approach, combining geometry and algebra.
The first step in the proof (§4) is to study the algebraic index $\operatorname{ind}(F|\partial\Gamma)$ of a polynomial
$F \in \mathbf{C}[Z]$ along the boundary of a rectangle $\Gamma \subset \mathbf{C}$, positively oriented as in Figure 1.

Example 1.8. Figure 1 (right) displays $F(\partial\Gamma)$ for $F = Z^5 - 5Z^4 - 2Z^3 - 2Z^2 - 3Z - 12$ and
$\Gamma = [-1,+1] \times [-1,+1]$. Here the index is seen to be $\operatorname{ind}(F|\partial\Gamma) = 2$.

Figure 1. The index $\operatorname{ind}(F|\partial\Gamma)$ of a polynomial $F \in \mathbf{C}[Z]$ with respect
to a rectangle $\Gamma \subset \mathbf{C}$, geometrically interpreted as the winding number


We then establish the algebraic generalization of Cauchy's theorem, extending Sturm's 
theorem from real to complex polynomials: 

Theorem 1.9. If $F \in \mathbf{C}[Z]$ does not vanish in any of the four vertices of the rectangle
$\Gamma \subset \mathbf{C}$, then the algebraic index $\operatorname{ind}(F|\partial\Gamma)$ equals the number of roots of $F$ in $\Gamma$:

• Each root of $F$ in the interior of $\Gamma$ counts with its multiplicity.

• Each root of $F$ in an edge of $\Gamma$ counts with half its multiplicity.

Remark 1.10. The hypothesis that $F \neq 0$ on the vertices is very mild and easy enough to
check in every concrete application. Unlike the integral formula (1.1), the algebraic index 
behaves well if zeros lie on (or close to) the boundary. This is yet another manifestation of 
the oft-quoted wisdom of d'Alembert that algebra is generous; she often gives more than 
we ask of her. Apart from its aesthetic appeal, the uniform treatment of all configurations 
simplifies theoretical arguments and practical implementations alike. 
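
The index of Theorem 1.9 can be sketched in exact rational arithmetic. The following is only an illustration under stated assumptions, not the article's implementation: coefficients are Gaussian rationals, the index along the boundary is taken to be half the sum of the Cauchy indices of $\operatorname{re}/\operatorname{im}$ over the four edges, and each Cauchy index is evaluated as the difference $V_a - V_b$ of half-weighted sign changes along the euclidean chain that starts with (denominator, numerator). Both conventions are assumed here, since this part of the text only announces them, and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import zip_longest


def trim(p):
    """Exact coefficients, lowest degree first; [] is the zero polynomial."""
    p = [Fraction(c) for c in p]
    while p and p[-1] == 0:
        p.pop()
    return p


def add(p, q):
    return trim(a + b for a, b in zip_longest(p, q, fillvalue=Fraction(0)))


def sub(p, q):
    return add(p, [-c for c in q])


def mul(p, q):
    out = [Fraction(0)] * max(len(p) + len(q) - 1, 0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)


def evaluate(p, x):
    acc = Fraction(0)
    for c in reversed(p):          # Horner's rule
        acc = acc * x + c
    return acc


def rem(a, b):
    """Remainder of the euclidean division of a by b (b non-zero)."""
    a, b = trim(a), trim(b)
    while len(a) >= len(b):
        q, shift = a[-1] / b[-1], len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] -= q * c
        a = trim(a)                # the leading term has cancelled
    return a


def sign(x):
    return (x > 0) - (x < 0)


def cauchy_index(num, den, a, b):
    """Ind_a^b(num/den), assumed equal to V_a - V_b along the chain (den, num, ...)."""
    chain = [trim(den), trim(num)]
    if not chain[0] or not chain[1]:
        return Fraction(0)         # degenerate case: index set to 0
    while chain[-1]:
        chain.append([-c for c in rem(chain[-2], chain[-1])])
    chain.pop()                    # drop the trailing zero polynomial

    def v(x):                      # half-weighted sign changes at x
        s = [sign(evaluate(p, x)) for p in chain]
        return sum(Fraction(abs(s[k - 1] - s[k]), 2) for k in range(1, len(s)))

    return v(Fraction(a)) - v(Fraction(b))


def index_on_rectangle(coeffs, x0, x1, y0, y1):
    """Index of F along the boundary of [x0,x1] x [y0,y1].

    coeffs = [(re, im), ...], lowest degree first.
    Assumes F does not vanish at the four vertices.
    """
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]   # counter-clockwise
    total = Fraction(0)
    for k in range(4):
        (px, py), (qx, qy) = corners[k], corners[(k + 1) % 4]
        lin_re, lin_im = [px, qx - px], [py, qy - py]     # edge t -> p + t (q - p)
        re, im = [], []
        for c_re, c_im in reversed(coeffs):               # Horner in Z along the edge
            re, im = (add(sub(mul(re, lin_re), mul(im, lin_im)), [c_re]),
                      add(add(mul(re, lin_im), mul(im, lin_re)), [c_im]))
        total += cauchy_index(re, im, 0, 1)
    return total / 2
```

For $F = Z$, given as `[(0, 0), (1, 0)]`, the sketch returns 1 on $[-1,1] \times [-1,1]$ and $\frac{1}{2}$ on $[0,1] \times [-1,1]$, where the root lies on an edge.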

The second step in the proof (§5) formalizes the geometric idea of Gauss' dissertation 
(1799), which becomes perfectly rigorous and nicely quantifiable in the algebraic setting: 

Theorem 1.11. For each polynomial $F = c_n Z^n + c_{n-1} Z^{n-1} + \dots + c_0$ in $\mathbf{C}[Z]$ with $c_n \neq 0$ we
define its Cauchy radius to be $\rho_F := 1 + \max\{|c_0|, \dots, |c_{n-1}|\}/|c_n|$. Then every rectangle
$\Gamma$ containing the disk $B(\rho_F) = \{z \in \mathbf{C} \mid |z| < \rho_F\}$ satisfies $\operatorname{ind}(F|\partial\Gamma) = n$.

Theorems 1.9 and 1.11 together imply that $\mathbf{C}$ is algebraically closed: each polynomial
$F \in \mathbf{C}[Z]$ of degree $n$ has $n$ roots in $\mathbf{C}$, each counted with its multiplicity; more precisely,
the square $\Gamma = [-\rho_F, \rho_F]^2 \subset \mathbf{C}$ contains $n$ roots of $F$.

Applied to the field $\mathbb{C} = \mathbb{R}[i]$ of complex numbers, this result is traditionally called the
Fundamental Theorem of Algebra, following Gauss, although nowadays it would be more 
appropriate to call it the "fundamental theorem of complex numbers". 
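
The Cauchy radius of Theorem 1.11 is straightforward to compute. The sketch below uses floating-point absolute values for convenience, which is an assumption of this illustration; the function name is hypothetical.

```python
def cauchy_radius(coeffs):
    """Cauchy radius 1 + max(|c_0|, ..., |c_{n-1}|) / |c_n|.

    coeffs lists c_0, ..., c_n (lowest degree first); c_n must be non-zero
    and the degree n must be at least 1.
    """
    *lower, lead = coeffs
    if not lower or lead == 0:
        raise ValueError("need degree >= 1 and a non-zero leading coefficient")
    return 1 + max(abs(c) for c in lower) / abs(lead)


# F = Z^2 - 3Z + 2 = (Z - 1)(Z - 2): both roots lie inside the disk of this radius.
radius = cauchy_radius([2, -3, 1])
```

Any square containing the disk of this radius can then serve as the starting rectangle for root location.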

We emphasize that the algebraic approach via Cauchy indices proves much more than 
mere existence of roots. It also establishes a root finding algorithm (§7.2): 

Theorem 1.12 (Fundamental Theorem of Algebra, effective version). For every polynomial
$F \in \mathbf{C}[Z]$ of degree $n \geq 1$ there exist $c, z_1, \dots, z_n \in \mathbf{C}$ such that

$$F = c (Z - z_1) \cdots (Z - z_n).$$

The algebraic index provides an explicit algorithm to locate all roots $z_1, \dots, z_n$ of $F$: starting
from some rectangle containing all $n$ roots, as in Theorem 1.11, we can subdivide and
keep only those rectangles that actually contain roots, using Theorem 1.9. All index
computations can be carried out using Sturm chains according to Theorem 1.2. By iterated
bisection we can thus approximate all roots to any desired precision.

Once sufficient approximations have been obtained, one can switch to Newton's method, 
which converges much faster but depends on good starting values (§7.3). 
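
The subdivision strategy of Theorem 1.12 can be sketched independently of how roots are counted. In the illustration below, `count` is a caller-supplied stand-in for the index $\operatorname{ind}(F|\partial\Gamma)$; the toy oracle simply looks up known roots, which is an assumption made only to keep the example self-contained. The sketch also ignores the vertex hypothesis of Theorem 1.9, which a real implementation would have to maintain. All names are hypothetical.

```python
from fractions import Fraction


def isolate_roots(count, rect, eps):
    """Subdivide rect = (x0, x1, y0, y1) until every kept box has side <= eps.

    count(box) stands in for the index of Theorem 1.9;
    boxes with count 0 are discarded.
    """
    todo, done = [rect], []
    while todo:
        x0, x1, y0, y1 = box = todo.pop()
        if count(box) == 0:
            continue                      # no root here: drop the box
        if max(x1 - x0, y1 - y0) <= eps:
            done.append(box)              # small enough: report it
            continue
        xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
        todo += [(x0, xm, y0, ym), (xm, x1, y0, ym),
                 (x0, xm, ym, y1), (xm, x1, ym, y1)]
    return done


# Toy oracle: the known roots 1 and 2 of Z^2 - 3Z + 2, as points (x, y).
ROOTS = [(Fraction(1), Fraction(0)), (Fraction(2), Fraction(0))]


def toy_count(box):
    x0, x1, y0, y1 = box
    return sum(x0 <= x <= x1 and y0 <= y <= y1 for x, y in ROOTS)


# The start square contains the disk of Cauchy radius 4; its cuts never hit a root.
start = (Fraction(-4), Fraction(5), Fraction(-5), Fraction(4))
boxes = isolate_roots(toy_count, start, Fraction(1, 100))
```

Replacing `toy_count` by an exact index computation would turn this skeleton into the root-finding procedure described above.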

Remark 1.13. In the real-algebraic setting of this article we consider the field operations
$(a,b) \mapsto a + b$, $a \mapsto -a$, $(a,b) \mapsto a \cdot b$, $a \mapsto a^{-1}$ in $\mathbf{R}$ and the comparisons $a = b$, $a < b$
as primitive operations. In this sense our proof yields an algorithm over $\mathbf{R}$. Over the
real numbers $\mathbb{R}$ this point of view was advanced by Blum-Cucker-Shub-Smale [6] by
extending the notion of Turing machines to hypothetical "real number machines".

In order to carry out the required real-algebraic operations on a Turing machine,
however, a more careful analysis is necessary (§7.1). At the very least, in order to implement
the required operations for a given polynomial $F = c_0 + c_1 Z + \dots + c_n Z^n$, we have to
assume that for the ordered field $\mathbb{Q}(\operatorname{re}(c_0), \operatorname{im}(c_0), \dots, \operatorname{re}(c_n), \operatorname{im}(c_n))$ the above primitive
operations are computable in the Turing sense. See §7 for a more detailed discussion.

1.5. Why yet another proof? There are several lines of proof leading to the Fundamental
Theorem of Algebra, and literally hundreds of variants have been published over the last
200 years (see §8). Why should we care for yet another proof?

The motivations for the present work are threefold:

First, on a philosophical level, it is satisfying to minimize the hypotheses and the tools 
used in the proof, and simultaneously maximize the conclusion. 

Second, when teaching mathematics, it is advantageous to have different proofs to 
choose from, adapted to the course's level and context. 

Third, from a practical point of view, it is desirable to have a constructive proof, even 
more so if it directly translates to a practical algorithm. 

In these respects the present approach offers several attractive features: 

(1) The proof is elementary, and a thorough treatment of the complex case (§4-§5) is
comparable in length and difficulty to Sturm's treatment of the real case (§2-§3).

(2) Since the proof uses only first-order properties (and not compactness, for example) 
all arguments hold verbatim over any real closed field (§2.3). 

(3) The proof is constructive in the sense that it establishes not only existence but also 
provides a method to locate the roots of F (§7.2). 

(4) The algorithm is fairly easy to implement on a computer and sufficiently efficient 
for medium-sized polynomials (§7.4). 

(5) Its economic use of axioms and its algebraic character make this approach ideally 
suited for a formal, computer-verified proof (§7.6). 

(6) Since the real-algebraic proof also provides an algorithm, the correctness of an 
implementation can likewise be formally proved and computer-verified. 

1.6. Sturm's forgotten proof. Attracted by the above features, I worked out the
real-algebraic proof for a computer algebra course in 2008. The idea seems natural, or even
obvious, and so I was quite surprised not to find any such proof in the modern literature. 
While retracing its history (§8), I was even more surprised when I finally unearthed very 
similar arguments in the works of Cauchy and Sturm (§8.4). Why have they been lost? 

Our proof is, of course, based on very classical ideas. The geometric idea goes back to 
Gauss in 1799, and all algebraic ingredients are present in the works of Sturm and Cauchy 
in the 1830s. Since then, however, they have evolved in very different directions: 

Sturm's theorem has become a cornerstone of real algebra. Cauchy's integral is the
starting point of complex analysis. Their algebraic method for counting complex roots,
however, has transited from algebra to applications, where its conceptual and algorithmic
simplicity are much appreciated. Since the end of the 19th century it is no longer found in
algebra textbooks, but is almost exclusively known as a computational tool, for example
in the Routh-Hurwitz theorem on the stability of motion (§6.1). After Sturm's outline of
1836, this algebraic tool seems not to have been employed to prove the existence of roots.

In retrospect, the proof presented here is thus a fortunate rediscovery of Sturm's ne- 
glected vision (§8.5). This article gives a modern, rigorous, and complete presentation, 
which means to set up the right definitions and to provide elementary, real-algebraic proofs. 

1.7. How this article is organized. Section 2 briefly recalls the notion of real closed 
fields, on which Sturm's theorem and the theory of Cauchy's algebraic index are built. 

Section 3 presents Sturm's theorem [54] counting real roots of real polynomials. The 
only novelty is the extension to boundary points, which is needed in Section 4. 

Section 4 proves Cauchy's theorem [9] counting complex roots of complex polynomials,
by establishing the multiplicativity (I2) of the algebraic index.

Section 5 establishes the Fundamental Theorem of Algebra via homotopy invariance
(I3), recasting the classical winding number approach in real algebra.

Section 6 highlights two further applications of the algebraic index: the Routh-Hurwitz
stability theorem and Brouwer's fixed point theorem over real closed fields.

Section 7 discusses algorithmic aspects, such as Turing computability, the efficient com- 
putation of Sturm chains and the cross-over to Newton's local method. 

Section 8, finally, provides historical comments in order to put the real-algebraic ap- 
proach into a wider perspective. 

The core of our real-algebraic proof is rather short (§4-§5). It seems necessary, however, 
to properly develop the underlying tools and to arrange the details of the real case (§2-§3). 
The algorithmic and historical aspects (§7-§8) complete the picture. Overall, I hope that 
the subject justifies the length of this article and its level of detail. 

2. Real closed fields 

There can be no purely algebraic proof of the Fundamental Theorem of Algebra in the 
sense that ordered fields and the intermediate value property of polynomials must enter the 
picture (see Remark 2.5 below). This is the natural setting of real algebra. 

We shall use only elementary properties of ordered fields, which are well-known from 
the real numbers (see for example Cohn [11, §8.6-§8.7]). In order to make the hypotheses 
precise, this section sets the scene by recalling the notion of a real closed field, on which 
Sturm's theorem is built, and sketches its analytic, algebraic, and logical context. 

2.1. Real numbers. As usual we denote by $\mathbb{R}$ the field of real numbers, that is, an ordered
field $(\mathbb{R}, +, \cdot, <)$ such that every non-empty bounded subset $A \subset \mathbb{R}$ has a least upper bound
in $\mathbb{R}$. This is a very strong property, and in fact it characterizes $\mathbb{R}$:

Theorem 2.1. For an ordered field $\mathbf{R}$ the following conditions are equivalent:

(1) The ordered set $(\mathbf{R}, <)$ satisfies the least upper bound property.

(2) Each interval $[a,b] \subset \mathbf{R}$ is compact as a topological space.

(3) Each interval $[a,b] \subset \mathbf{R}$ is connected as a topological space.

(4) The intermediate value property holds for all continuous functions $f\colon \mathbf{R} \to \mathbf{R}$.

Any two ordered fields satisfying these properties are isomorphic by a unique field
isomorphism. The construction of the real numbers $\mathbb{R}$ shows that such a field exists. □

Proof. Existence and uniqueness of the field $\mathbb{R}$ of real numbers form the foundation of any
analysis course. Most analysis books prove (1) $\Rightarrow$ (2) $\Rightarrow$ (4), while (3) $\Leftrightarrow$ (4) is essentially
the definition of connectedness. Here we only show (4) $\Rightarrow$ (1), in the form $\neg$(1) $\Rightarrow$ $\neg$(4).

Let $A \subset \mathbf{R}$ be non-empty and bounded above. Define $f\colon \mathbf{R} \to \{\pm 1\}$ by $f(x) = 1$ if $a \leq x$
for all $a \in A$, and $f(x) = -1$ if $x < a$ for some $a \in A$. In other words, we have $f(x) = 1$ if
and only if $x$ is an upper bound. If $f$ is discontinuous in $x$, then $f(x) = +1$ but $f(y) = -1$
for all $y < x$, whence $x = \sup A$. If $A$ does not have a least upper bound in $\mathbf{R}$, then $f$ is
continuous but does not satisfy the intermediate value property. □

2.2. Real closed fields. The field $\mathbb{R}$ of real numbers provides the foundation of analysis.
In the present article it appears as the most prominent example of the much wider class of
real closed fields. The reader who wishes to concentrate on the classical case may skip the
rest of this section and assume $\mathbf{R} = \mathbb{R}$ throughout.

Definition 2.2. An ordered field $(\mathbf{R}, +, \cdot, <)$ is real closed if it satisfies the intermediate
value property for polynomials: whenever a polynomial $P \in \mathbf{R}[X]$ satisfies $P(a)P(b) < 0$
for some $a < b$ in $\mathbf{R}$, then there exists $x \in\, ]a,b[$ such that $P(x) = 0$.

For example, the field $\mathbb{R}$ of real numbers is real closed by Theorem 2.1 above. The field
$\mathbb{Q}$ of rational numbers is not real closed, as shown by the example $P = X^2 - 2$ on $[1,2]$.
The algebraic closure $\mathbb{Q}^c$ of $\mathbb{Q}$ in $\mathbb{R}$ is a real closed field. In fact, $\mathbb{Q}^c$ is the smallest real
closed field, in the sense that $\mathbb{Q}^c$ is contained in any real closed field. Notice that $\mathbb{Q}^c$ is
much smaller than $\mathbb{R}$, in fact $\mathbb{Q}^c$ is countable whereas $\mathbb{R}$ is uncountable.
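
The failure of the intermediate value property over $\mathbb{Q}$ can be observed with exact rational arithmetic. The following sketch (function name hypothetical) bisects a sign change of $P = X^2 - 2$ on $[1,2]$; the bracket shrinks indefinitely, yet no rational midpoint is ever a root. This only illustrates the phenomenon and proves nothing by itself.

```python
from fractions import Fraction


def bisect_sign_change(p, lo, hi, steps):
    """Halve [lo, hi] `steps` times, keeping a sign change of p at the endpoints."""
    lo, hi = Fraction(lo), Fraction(hi)
    assert p(lo) * p(hi) < 0
    for _ in range(steps):
        mid = (lo + hi) / 2
        if p(mid) == 0:
            return mid, mid               # an exact root was found
        if p(lo) * p(mid) < 0:
            hi = mid
        else:
            lo = mid
    return lo, hi


lo, hi = bisect_sign_change(lambda x: x * x - 2, 1, 2, 20)
```

After 20 steps the bracket has width $2^{-20}$ and still satisfies $lo^2 < 2 < hi^2$.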

Remark 2.3. In the context of this article, Definition 2.2 is the natural starting point because
it captures the essential geometric feature of real-algebraic geometry. It deviates, however, 
from Artin's algebraic definition, which says that an ordered field is real closed if no proper 
algebraic extension can be ordered. See Cohn [11, Prop. 8.8.9] for their equivalence. 

Not all interesting ordered fields are subfields of $\mathbb{R}$; the field $\mathbb{R}(X)$ of rational
functions can be ordered (in many different ways, see [27, §II.9]) but does not embed into $\mathbb{R}$.
Nevertheless, like every ordered field, it can be embedded into some real closure:

Theorem 2.4. Every ordered field $\mathbf{K}$ admits a real closure, i.e., a real closed field $\mathbf{R} \supset \mathbf{K}$
that extends the ordering and is algebraic over $\mathbf{K}$. Any two real closures $\mathbf{R}, \mathbf{R}'$ of $\mathbf{K}$ are
isomorphic via a unique field isomorphism $\mathbf{R} \to \mathbf{R}'$ fixing $\mathbf{K}$. □

Notice that the real closure is much more rigid than the algebraic closure. For more
on real closed fields see Jacobson [26, chapters I.5 and II.11], Cohn [11, chapter 8], and
Knebusch-Scheiderer [27, Kapitel I]. We only mention that in a real closed field $\mathbf{R}$ every
positive element has a square root. As a consequence the ordering on $\mathbf{R}$ can be
characterized in algebraic terms: $x \geq 0$ if and only if there exists $r \in \mathbf{R}$ such that $r^2 = x$. In particular,
if a field $\mathbf{R}$ is real closed, then it admits precisely one ordering.

Remark 2.5. As outlined in the introduction, we wish to show that if a field $\mathbf{R}$ is real closed,
then $\mathbf{C} = \mathbf{R}[i]$ is algebraically closed. Conversely, Artin and Schreier [4] have proved that
if $\mathbf{C}$ is algebraically closed and contains a subfield $\mathbf{R}$ such that $1 < \dim_{\mathbf{R}}(\mathbf{C}) < \infty$, then $\mathbf{R}$
is real closed and $\mathbf{C} = \mathbf{R}[i]$. We shall not use this striking result, but it underlines that we
have chosen minimal hypotheses.

2.3. Elementary theory of ordered fields. The axioms of an ordered field $(\mathbf{R}, +, \cdot, <)$
are formulated in first-order logic, which means that we quantify over elements of $\mathbf{R}$, but
not over subsets, functions, etc. By way of contrast, the characterization of the field $\mathbb{R}$ of
real numbers (Theorem 2.1) is of a different nature: here we have to quantify over subsets
of $\mathbb{R}$, or functions $\mathbb{R} \to \mathbb{R}$, and such a formulation requires second-order logic.

The algebraic condition for an ordered field to be real closed is of first order. It is given
by an axiom scheme where for each degree $n \in \mathbb{N}$ we have one axiom of the form

(2.1)   $\forall a, b, c_0, c_1, \dots, c_n \in \mathbf{R}\ \bigl[ (c_0 + c_1 a + \dots + c_n a^n)(c_0 + c_1 b + \dots + c_n b^n) < 0 \ \Rightarrow\ \exists x \in \mathbf{R}\ \bigl( (x-a)(x-b) < 0 \wedge c_0 + c_1 x + \dots + c_n x^n = 0 \bigr) \bigr].$

Remark 2.6. First-order formulae are customarily called elementary. For a given ordered
field $\mathbf{R}$, the collection of all first-order assertions that are true over $\mathbf{R}$ is called the
elementary theory of $\mathbf{R}$. Tarski's theorem [26, 7] says that all real closed fields share the
same elementary theory: if an assertion in the first-order language of ordered fields is true
over one real closed field, for example the real numbers, then it is true over any other
real closed field. (This no longer holds for second-order logic, where $\mathbb{R}$ is singled out.)
Tarski's theorem is a vast generalization of Sturm's theorem, and so is its effective
formulation, called quantifier elimination, which provides explicit decision procedures. We will
not use Tarski's theorem; it only serves to situate our approach in its logical context.

From Tarski's meta-mathematical viewpoint it is not surprising that the statement of the
Fundamental Theorem of Algebra generalizes to an arbitrary real closed field, because in
each degree it is of first order. It is remarkable, however, to construct a first-order proof that
is as direct and elegant as the second-order version. The real-algebraic proof presented here
achieves this goal and, moreover, is geometrically appealing and algorithmically effective.

3. Sturm's theorem for real polynomials 

This section recalls Sturm's theorem for real polynomials - a gem of 19th century
algebra and one of the greatest discoveries in the theory of polynomials. It seems impossible to
surpass the elegance of the original mémoires by Sturm [54] and Cauchy [9]. One technical
improvement of our presentation, however, seems noteworthy:

The inclusion of boundary points streamlines the arguments so that they will apply
seamlessly to the complex setting in §4. The necessary amendments make the
development hardly any longer or more complicated. They pervade, however, all statements and
proofs, so that it seems worthwhile to review the classical arguments in full detail.

3.1. Counting sign changes. Consider a finite sequence $s = (s_0, \dots, s_n)$ in $\mathbf{R}$. We say that
the pair $(s_{k-1}, s_k)$ presents a sign change if $s_{k-1} s_k < 0$. The pair presents half a sign change
if one element is zero while the other is non-zero. In the remaining cases there is no sign
change. All cases can be subsumed by the simple formula

(3.1)   $V(s_{k-1}, s_k) = \frac{1}{2} \bigl| \operatorname{sign}(s_{k-1}) - \operatorname{sign}(s_k) \bigr|.$

Here we define $\operatorname{sign}\colon \mathbf{R} \to \{-1, 0, +1\}$ as usual by $\operatorname{sign}(x) = +1$ if $x > 0$, $\operatorname{sign}(x) = -1$ if
$x < 0$, and $\operatorname{sign}(0) = 0$.

Definition 3.1. For a finite sequence $s = (s_0, \dots, s_n)$ in $\mathbf{R}$ the number of sign changes is

(3.2)   $V(s) := \sum_{k=1}^{n} V(s_{k-1}, s_k) = \sum_{k=1}^{n} \frac{1}{2} \bigl| \operatorname{sign}(s_{k-1}) - \operatorname{sign}(s_k) \bigr|.$

For a finite sequence $(S_0, \dots, S_n)$ of polynomials in $\mathbf{R}[X]$ and $a \in \mathbf{R}$ we set

(3.3)   $V_a(S_0, \dots, S_n) := V(S_0(a), \dots, S_n(a)).$

For the difference at two points $a, b \in \mathbf{R}$ we use the notation $V_a^b := V_a - V_b$.

Remark 3.2. The number $V(s_0, \dots, s_n)$ does not change if we multiply all $s_0, \dots, s_n$ by some
constant $q \in \mathbf{R}^*$. Likewise, $V_a^b(S_0, \dots, S_n)$ remains unchanged if we multiply all $S_0, \dots, S_n$
by some polynomial $Q \in \mathbf{R}[X]^*$ that does not vanish in $\{a,b\}$.
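
The counting rule (3.2) is easy to state in code. The sketch below (function names hypothetical) implements $V(s)$ with exact halves and, for contrast, the traditional convention that discards zeros before counting; the two disagree on sequences such as $(1, 0, 1)$.

```python
from fractions import Fraction


def sign(x):
    return (x > 0) - (x < 0)


def sign_changes(s):
    """V(s) as in (3.2): each neighbouring pair contributes 0, 1/2 or 1."""
    return sum((Fraction(abs(sign(a) - sign(b)), 2) for a, b in zip(s, s[1:])),
               Fraction(0))


def sign_changes_without_zeros(s):
    """Traditional count: discard zeros first, then count strict sign changes."""
    t = [x for x in s if x != 0]
    return sum(a * b < 0 for a, b in zip(t, t[1:]))
```

For example, `sign_changes([1, 0, 1])` is 1 (two half changes), whereas `sign_changes_without_zeros([1, 0, 1])` is 0. Multiplying a sequence by a non-zero constant leaves `sign_changes` unchanged, as stated in Remark 3.2.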

Remark 3.3. There is no universal agreement for counting sign changes because each
application requires its specific conventions. While there is no ambiguity for $s_{k-1} s_k < 0$ and
$s_{k-1} s_k > 0$, some arbitration is needed to take care of possible zeros. Our definition has
been chosen to account for boundary points in Sturm's theorem, as explained below.

The traditional way of counting sign changes, following Descartes and Fourier, is to
extract the subsequence $\hat{s}$ by discarding the zeros of $s$ and to define $\hat{V}(s) := V(\hat{s})$. This
renders the counting rule non-local, whereas in the above formula only neighbours interact.

3.2. The Cauchy index. Index theory is based on judicious counting. Instead of counting
zeros of $f$ it is customary to count poles of $\frac{1}{f}$, which is of course equivalent. We thus
begin by recalling the Cauchy index counting poles of rational functions.

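Definition 3.4. We denote by $\lim_a^+ f$ and $\lim_a^- f$ the right and left limit, respectively, of a
rational function $f \in \mathbf{R}(X)^*$ in a point $a \in \mathbf{R}$. The Cauchy index of $f$ in $a$ is defined as

(3.4)   $\operatorname{Ind}_a(f) := \operatorname{Ind}_a^+(f) - \operatorname{Ind}_a^-(f)$ where $\operatorname{Ind}_a^\varepsilon(f) := \begin{cases} +\frac{1}{2} & \text{if } \lim_a^\varepsilon f = +\infty, \\ -\frac{1}{2} & \text{if } \lim_a^\varepsilon f = -\infty, \\ 0 & \text{otherwise.} \end{cases}$

Less formally, we have $\operatorname{Ind}_a(f) = +1$ if $f$ jumps from $-\infty$ to $+\infty$, and $\operatorname{Ind}_a(f) = -1$
if $f$ jumps from $+\infty$ to $-\infty$, and $\operatorname{Ind}_a(f) = 0$ in all other cases. For example, we have
$\operatorname{Ind}_0(\frac{1}{X}) = +1$ and $\operatorname{Ind}_0(-\frac{1}{X}) = -1$ and $\operatorname{Ind}_0(\pm\frac{1}{X^2}) = 0$.

[Figure: the four local configurations of a pole at $a$, with one-sided contributions $\pm\frac{1}{2}$, giving Ind = +1, Ind = -1, Ind = 0, and Ind = 0.]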


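Remark 3.5. The limits $\lim_a^\varepsilon f$ are just a convenient notation for purely algebraic
quantities: we can factor $f = (X - a)^m g$ with $m \in \mathbb{Z}$ and $g \in \mathbf{R}(X)^*$ such that $g(a) \in \mathbf{R}^*$.

• If $m > 0$, then $\lim_a^\varepsilon f = 0$ for both $\varepsilon \in \{+, -\}$.

• If $m = 0$, then $\lim_a^\varepsilon f = g(a)$ for both $\varepsilon \in \{+, -\}$.

• If $m < 0$, then $\lim_a^\varepsilon f = \varepsilon^m \cdot \operatorname{sign} g(a) \cdot (+\infty)$.

In the first two cases $f$ is continuous in $a$ or can be continuously extended to $a$, and
$\operatorname{Ind}_a^\varepsilon(f) = 0$. In the last case $f$ has a pole of order $|m|$ in $a$, and $\operatorname{Ind}_a^\varepsilon(f) = \frac{1}{2} \varepsilon^m \cdot \operatorname{sign} g(a)$.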
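
Remark 3.5 translates directly into a purely algebraic computation of $\operatorname{Ind}_a(f)$: combining both sides gives $\operatorname{Ind}_a(f) = \operatorname{sign} g(a)$ if $m$ is negative and odd, and 0 otherwise. The following sketch assumes rational coefficients and a rational point $a$; the function names are hypothetical.

```python
from fractions import Fraction


def evaluate(p, x):
    acc = Fraction(0)
    for c in reversed(p):               # Horner's rule, lowest degree first
        acc = acc * x + c
    return acc


def deflate(p, a):
    """Split a non-zero p as (X - a)^mult * cofactor with cofactor(a) != 0."""
    p, mult = [Fraction(c) for c in p], 0
    while len(p) > 1 and evaluate(p, a) == 0:
        quotient, acc = [], Fraction(0)
        for c in reversed(p):           # synthetic division by (X - a)
            acc = acc * a + c
            quotient.append(acc)
        quotient.pop()                  # last value is the remainder, here 0
        p, mult = quotient[::-1], mult + 1
    return mult, p


def cauchy_index_at(num, den, a):
    """Ind_a(num/den) via f = (X - a)^m g: sign g(a) if m < 0 is odd, else 0."""
    a = Fraction(a)
    m_num, g_num = deflate(num, a)
    m_den, g_den = deflate(den, a)
    m = m_num - m_den
    if m >= 0 or m % 2 == 0:
        return 0
    return 1 if evaluate(g_num, a) / evaluate(g_den, a) > 0 else -1
```

With coefficients listed lowest degree first, `cauchy_index_at([1], [0, 1], 0)` reproduces $\operatorname{Ind}_0(\frac{1}{X}) = +1$.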

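Definition 3.6. For $a < b$ in $\mathbf{R}$ we define the Cauchy index of $f \in \mathbf{R}(X)^*$ on the interval
$[a,b]$ by

(3.5)   $\operatorname{Ind}_a^b(f) := \operatorname{Ind}_a^+(f) + \sum_{x \in ]a,b[} \operatorname{Ind}_x(f) - \operatorname{Ind}_b^-(f).$

The sum is well-defined because only finitely many $x \in\, ]a,b[$ contribute.
For $b < a$ we define $\operatorname{Ind}_a^b(f) := -\operatorname{Ind}_b^a(f)$, and for $a = b$ we set $\operatorname{Ind}_a^a(f) := 0$.
Finally, we set $\operatorname{Ind}_a^b(\frac{R}{S}) := 0$ in the degenerate case where $R = 0$ or $S = 0$.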

Remark 3.7. We opt for a more comprehensive definition (3.5) than usual, in order to take 
care of boundary points. We will frequently bisect intervals, and this technique works best 
with a uniform definition that avoids case distinctions. Moreover, we will have reason to 
consider piecewise rational functions in §4. 

Proposition 3.8. The Cauchy index enjoys the following properties (which formally
resemble the properties of integration):

(a) bisection: $\operatorname{Ind}_a^b(f) + \operatorname{Ind}_b^c(f) = \operatorname{Ind}_a^c(f)$ for all $a, b, c \in \mathbf{R}$.

(b) invariance: $\operatorname{Ind}_a^b(f \circ \tau) = \operatorname{Ind}_{\tau(a)}^{\tau(b)}(f)$ for every linear fractional transformation
$\tau\colon [a,b] \to \mathbf{R}$, $\tau(x) = \frac{px+q}{rx+s}$ where $p, q, r, s \in \mathbf{R}$, without poles on $[a,b]$.

(c) addition: $\operatorname{Ind}_a^b(f + g) = \operatorname{Ind}_a^b(f) + \operatorname{Ind}_a^b(g)$ if $f, g$ have no common poles.

(d) scaling: $\operatorname{Ind}_a^b(gf) = \sigma \operatorname{Ind}_a^b(f)$ if $g|_{[a,b]}$ is of constant sign $\sigma \in \{\pm 1\}$. □

3.3. Counting real roots. The ring $\mathbf{R}[X]$ is equipped with a derivation $P \mapsto P'$ sending
each polynomial $P = \sum_{k=0}^{n} p_k X^k$ to its formal derivative $P' = \sum_{k=1}^{n} k p_k X^{k-1}$. This extends
in a unique way to a derivation on the field $\mathbf{R}(X)$ sending $f = \frac{R}{S}$ to $f' = \frac{R'S - RS'}{S^2}$. This is
an $\mathbf{R}$-linear map and satisfies Leibniz' rule $(fg)' = f'g + fg'$. For $f \in \mathbf{R}(X)^*$ the quotient
$f'/f$ is called the logarithmic derivative of $f$; it enjoys the following property:

Proposition 3.9. For every $f \in \mathbf{R}(X)^*$ we have $\operatorname{Ind}_a(f'/f) = +1$ if $a$ is a zero of $f$, and
$\operatorname{Ind}_a(f'/f) = -1$ if $a$ is a pole of $f$, and $\operatorname{Ind}_a(f'/f) = 0$ in all other cases.

Proof. We have $f = (X-a)^m g$ with $m \in \mathbb{Z}$ and $g \in \mathbf{R}(X)^*$ such that $g(a) \in \mathbf{R}^*$. By Leibniz'
rule we obtain $\frac{f'}{f} = \frac{m}{X-a} + \frac{g'}{g}$. The last fraction does not contribute to the index because it
does not have a pole in $a$. We conclude that $\operatorname{Ind}_a(f'/f) = \operatorname{sign}(m)$. □

Corollary 3.10. For every $f \in \mathbf{R}(X)^*$ and $a < b$ in $\mathbf{R}$ the index $\operatorname{Ind}_a^b(f'/f)$ is the number
of roots minus the number of poles of $f$ in $[a,b]$, counted without multiplicity. Roots and
poles on the boundary count for one half. □

The corollary remains true for $f = \frac{R}{S}$ when $R = 0$ or $S = 0$, with the convention that
we count only isolated roots and poles. Polynomials $P \in \mathbf{R}[X]$ have no poles, whence
$\operatorname{Ind}_a^b(P'/P)$ simply counts the number of (isolated) roots of $P$ in $[a,b]$.

3.4. The inversion formula. The intermediate value property of polynomials $P \in \mathbf{R}[X]$
can now be reformulated quantitatively as $\operatorname{Ind}_a^b(\frac{1}{P}) = V_a^b(1,P)$. More generally, we have
the following result of Cauchy [9, §1, Thm. I]:

Theorem 3.11. If $P, Q \in \mathbf{R}[X]$ have no common zero in $a$ nor $b$, then

(3.6)  $\operatorname{Ind}_a^b(\frac{Q}{P}) + \operatorname{Ind}_a^b(\frac{P}{Q}) = V_a^b(P,Q)$.

Proof. The statement is true if $P = 0$ or $Q = 0$, so we can assume $P,Q \in \mathbf{R}[X]^*$. Equation (3.6)
continues to hold if we divide $P,Q$ by a common factor $U \in \mathbf{R}[X]$, because our
hypothesis ensures that $U(a) \ne 0$ and $U(b) \ne 0$. We can thus assume $\gcd(P,Q) = 1$.

Suppose first that $[a,b]$ contains no pole. On the one hand, both indices $\operatorname{Ind}_a^b(\frac{Q}{P})$ and
$\operatorname{Ind}_a^b(\frac{P}{Q})$ vanish in the absence of poles. On the other hand, the intermediate value property
ensures that both $P$ and $Q$ are of constant sign on $[a,b]$, whence $V_a(P,Q) = V_b(P,Q)$.

Suppose next that $[a,b]$ contains at least one pole. Formula (3.6) is additive with respect
to bisection of the interval $[a,b]$. It thus suffices to treat the case where $[a,b]$ contains
exactly one pole. Bisecting once more, if necessary, we can assume that this pole is either
$a$ or $b$. Applying the symmetry $X \mapsto a + b - X$, if necessary, we can assume that the pole
is $a$. Since Formula (3.6) is symmetric in $P$ and $Q$, we can assume that $P(a) = 0$.

By hypothesis we have $Q(a) \ne 0$, whence $Q$ has constant sign on $[a,b]$ and $\operatorname{Ind}_a^b(\frac{P}{Q}) = 0$.
Likewise, $P$ has constant sign on $]a,b]$ and $\operatorname{Ind}_a^b(\frac{Q}{P}) = \operatorname{Ind}_a^+(\frac{Q}{P})$. On the right hand side
we find $V_a(P,Q) = \tfrac12$, and for $V_b(P,Q)$ two cases occur:

• If $V_b(P,Q) = 0$, then $\frac{Q}{P} > 0$ on $]a,b]$, whence $\lim_a^+(\frac{Q}{P}) = +\infty$.

• If $V_b(P,Q) = 1$, then $\frac{Q}{P} < 0$ on $]a,b]$, whence $\lim_a^+(\frac{Q}{P}) = -\infty$.

In both cases we find $\operatorname{Ind}_a^+(\frac{Q}{P}) = V_a^b(P,Q)$, whence Equation (3.6) holds. □

3.5. Sturm chains. We can calculate the Cauchy index $\operatorname{Ind}_a^b(\frac{R}{S})$ by applying certain
transformations to $R$ and $S$. The essential condition is the following:

Definition 3.12. A sequence of polynomials $(S_0, \dots, S_n)$ in $\mathbf{R}[X]$ is a Sturm chain with
respect to an interval $[a,b] \subset \mathbf{R}$ if it satisfies Sturm's condition:

(3.7)  If $S_k(x) = 0$ for $0 < k < n$ and $x \in [a,b]$, then $S_{k-1}(x) S_{k+1}(x) < 0$.

We will usually not explicitly mention the interval $[a,b]$ if it is understood from the
context, or if $(S_0, \dots, S_n)$ is a Sturm chain on all of $\mathbf{R}$. Moreover, we tacitly assume that
$n \ge 2$; for $n = 1$ Condition (3.7) is void and should be replaced by the requirement that $S_0$
and $S_1$ have no common zeros.

Theorem 3.13. If $(S_0, S_1, \dots, S_{n-1}, S_n)$ is a Sturm chain in $\mathbf{R}[X]$, then

(3.8)  $\operatorname{Ind}_a^b(\frac{S_1}{S_0}) + \operatorname{Ind}_a^b(\frac{S_{n-1}}{S_n}) = V_a^b(S_0, S_1, \dots, S_{n-1}, S_n)$.

Proof. The Sturm condition ensures that two consecutive functions $S_{k-1}$ and $S_k$ have no
common zeros. For $n = 1$ Formula (3.8) reduces to the inversion formula of Theorem 3.11.
For $n = 2$ the inversion formula implies that

(3.9)  $\operatorname{Ind}_a^b(\frac{S_1}{S_0}) + \operatorname{Ind}_a^b(\frac{S_0}{S_1}) + \operatorname{Ind}_a^b(\frac{S_2}{S_1}) + \operatorname{Ind}_a^b(\frac{S_1}{S_2}) = V_a^b(S_0, S_1, S_2)$.

This is a telescopic sum: contributions to the middle indices arise at zeros of $S_1$, but at each
zero of $S_1$ its neighbours $S_0$ and $S_2$ have opposite signs, which means that the middle terms
cancel each other. Iterating this argument, we obtain (3.8) by induction on $n$. □

The following algebraic criterion for Sturm chains will be used in §3.6 and §5.1: 

Proposition 3.14. Consider a sequence $(S_0, \dots, S_n)$ in $\mathbf{R}[X]$ such that:

(1) $A_k S_{k+1} + B_k S_k + C_k S_{k-1} = 0$ with $A_k, B_k, C_k \in \mathbf{R}[X]$ for $0 < k < n$.

(2) $A_k(x) \ne 0$ and $A_k(x) C_k(x) \ge 0$ whenever $S_k(x) = 0$ for $0 < k < n$.

Then $(S_0, \dots, S_n)$ is a Sturm chain on $[a,b]$ if and only if the terminal pair $(S_{n-1}, S_n)$ has
no common zeros in $[a,b]$. In particular, this is satisfied if $S_n$ has no zeros in $[a,b]$.

Proof. We assume that $n \ge 2$. If $(S_{n-1}, S_n)$ has a common zero, then the Sturm condition
(3.7) is obviously violated. Suppose that $(S_{n-1}, S_n)$ has no common zeros in $[a,b]$. If
$S_k(x) = 0$ for $x \in [a,b]$ and $0 < k < n$, then $S_{k+1}(x) \ne 0$. Otherwise Conditions (1) and
(2) would imply that $S_k, \dots, S_n$ vanish in $x$, which is excluded by our hypothesis. Now
the equation $A_k(x) S_{k+1}(x) + C_k(x) S_{k-1}(x) = 0$ together with $A_k(x) S_{k+1}(x) \ne 0$ shows that
$C_k(x) \ne 0$ and $S_{k-1}(x) \ne 0$. Using $A_k(x) C_k(x) \ge 0$ we conclude that $S_{k-1}(x) S_{k+1}(x) < 0$. □

Remark 3.15. In Proposition 3.14 the essential condition is (1). One can then eliminate
common zeros $x \in [a,b]$ and arrange the sign condition $A_k(x) C_k(x) \ge 0$ by multiplying the
polynomials $S_k$ and the coefficients $A_k, B_k, C_k$ by suitable factors $\pm (X - x)^m$.

Remark 3.16. The technical Condition (2) can be replaced by the simpler but more restrictive
condition that $A_k, C_k > 0$ on $[a,b]$. The linear relation (1) then resembles the mean
value property of harmonic functions, here discretized to a graph in form of a chain.

3.6. Euclidean Sturm chains. In the preceding paragraph we have seen the definition of 
Sturm chains and their main application to Cauchy indices. Everything so far is completely 
general and applies not only to polynomials. The crucial observation for polynomials is 
that the euclidean algorithm allows us to construct Sturm chains: 

Consider a rational function $f = \frac{R}{S} \in \mathbf{R}(X)^*$ represented by polynomials $R,S \in \mathbf{R}[X]^*$.
Iterated euclidean division produces a sequence of polynomials starting with $P_0 = S$ and
$P_1 = R$, such that $P_{k+1} = Q_k P_k - P_{k-1}$ with $\deg P_{k+1} < \deg P_k$ for all $k = 1,2,3,\dots$. This
process eventually stops when we reach $P_{n+1} = 0$, in which case $P_n \sim \gcd(P_0, P_1)$.

Stated differently, this construction is the expansion into the continued fraction

$f = \frac{P_1}{P_0} = \cfrac{1}{Q_1 - \cfrac{1}{Q_2 - \cfrac{1}{\ddots - \cfrac{1}{Q_n}}}}$.






Definition 3.17. Using the preceding notation, the euclidean Sturm chain $(S_0, S_1, \dots, S_n)$
associated to the fraction $\frac{R}{S}$ is defined by $S_k := P_k / P_n$ for $k = 0, 1, \dots, n$.

Notice that the chain $(S_0, S_1, \dots, S_n)$ depends only on the fraction $\frac{R}{S}$ in the field $\mathbf{R}(X)$
but not on the pair $(R,S)$ representing it. Division by $P_n$ ensures that $\gcd(S_0, S_1) = S_n = 1$
but preserves the equations $S_{k-1} + S_{k+1} = Q_k S_k$ for all $0 < k < n$, and Proposition 3.14
ensures that $(S_0, S_1, \dots, S_n)$ is indeed a Sturm chain.

Remark 3.18 (pseudo-euclidean division). If $K$ is a field, then for every $S \in K[X]$ and
$P \in K[X]^*$ there exists a unique pair $Q,R \in K[X]$ such that

(3.10)  $S = PQ - R$ and $\deg R < \deg P$.

Here the negative sign has been chosen for the application to Sturm chains. This division
works over every ring $K$ provided that the leading coefficient $c$ of $P$ is invertible in $K$. In
general, over every integral ring $K$, we can carry out pseudo-euclidean division: for all
$S \in K[X]$ and $P \in K[X]^*$ there exists a unique pair $Q,R \in K[X]$ such that

(3.11)  $c^d S = PQ - R$ and $\deg R < \deg P$,

where $c$ is the leading coefficient of $P$ and $d = \max\{0, 1 + \deg S - \deg P\}$. With a view to
ordered fields it is advantageous to choose the exponent $d$ to be even. This will be applied
in §5.1 to the polynomial ring $\mathbf{R}[Y,X] = K[X]$ over $K = \mathbf{R}[Y]$. Even for $\mathbb{Q}[X]$ it is often
more efficient to work in $\mathbb{Z}[X]$ in order to avoid coefficient swell in calculations (see §7.4).
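
The pseudo-euclidean division (3.11) can be sketched over the integers as follows. This is a minimal illustration, not the author's implementation: the function name `pseudo_divide` and the representation of polynomials as integer coefficient lists (lowest degree first) are assumptions made here, and the exponent $d$ is rounded up to the next even number as suggested above.

```python
def pseudo_divide(s, p):
    """Return (q, r, d) with c**d * s == p*q - r, deg r < deg p and d even.

    s, p: integer coefficient lists, lowest degree first (hypothetical layout);
    c is the leading coefficient of p, which must be non-zero.
    """
    s, p = list(s), list(p)
    while s and s[-1] == 0:
        s.pop()
    while p and p[-1] == 0:
        p.pop()
    if not p:
        raise ZeroDivisionError("divisor must be a non-zero polynomial")
    c = p[-1]
    d = max(0, 1 + (len(s) - 1) - (len(p) - 1))
    d += d % 2                       # make the exponent even
    q = [0] * max(len(s) - len(p) + 1, 0)
    rem = s[:]                       # invariant: c**power * s == p*q + rem
    power = 0
    while len(rem) >= len(p):
        shift = len(rem) - len(p)
        lead = rem[-1]
        q = [c * x for x in q]       # scale by c so the next step stays integral
        rem = [c * x for x in rem]
        q[shift] += lead
        for i, coeff in enumerate(p):
            rem[shift + i] -= lead * coeff
        while rem and rem[-1] == 0:
            rem.pop()
        power += 1
    extra = c ** (d - power)         # pad up to the full (even) exponent d
    q = [extra * x for x in q]
    # Return R = -rem so that c**d * S = P*Q - R, matching the sign in (3.11).
    return q, [-extra * x for x in rem], d


# Example: S = X^3 + 1, P = 2X + 1, so c = 2 and d = 4.
q, r, d = pseudo_divide([1, 0, 0, 1], [1, 2])
print(q, r, d)   # [2, -4, 8] [-14] 4, i.e. 16(X^3 + 1) = (2X + 1)(8X^2 - 4X + 2) + 14
```

All intermediate values are integers, which is the point of the construction: no fractions appear, at the price of the factor $c^d$.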

3.7. Sturm's theorem. Using the euclidean algorithm for constructing Sturm chains, we
can now fix the following notation:

Definition 3.19. For $\frac{R}{S} \in \mathbf{R}(X)$ and $a,b \in \mathbf{R}$ we define the Sturm index to be

$\operatorname{Sturm}_a^b(\frac{R}{S}) := V_a^b(S_0, S_1, \dots, S_n)$,

where $(S_0, S_1, \dots, S_n)$ is the euclidean Sturm chain associated to $\frac{R}{S}$. We include two
exceptional cases: If $S = 0$ and $R \ne 0$, the euclidean Sturm chain is $(0, 1)$ of length $n = 1$. If
$R = 0$, we take the chain $(1)$ of length $n = 0$. In both cases we obtain $\operatorname{Sturm}_a^b(\frac{R}{S}) = 0$.

This definition is effective in the sense that the Sturm index $\operatorname{Sturm}_a^b(\frac{R}{S})$ can immediately
be calculated. Definition 3.6 of the Cauchy index $\operatorname{Ind}_a^b(\frac{R}{S})$, however, assumes knowledge of
all roots of $S$ in $[a,b]$. This difficulty is overcome by Sturm's celebrated theorem, equating
the Cauchy index with the Sturm index:

Theorem 3.20 (Sturm 1829/35, Cauchy 1831/37). For $R,S \in \mathbf{R}[X]$ we have

(3.12)  $\operatorname{Ind}_a^b(\frac{R}{S}) = \operatorname{Sturm}_a^b(\frac{R}{S})$.

Proof. Equation (3.12) is trivially true if $R = 0$ or $S = 0$, according to our definitions. We
can thus assume $R,S \in \mathbf{R}[X]^*$. Let $(S_0, S_1, \dots, S_n)$ be the euclidean Sturm chain associated
to the fraction $\frac{R}{S}$. From Theorem 3.13 we know that

$\operatorname{Ind}_a^b(\frac{S_1}{S_0}) + \operatorname{Ind}_a^b(\frac{S_{n-1}}{S_n}) = V_a^b(S_0, S_1, \dots, S_n)$.

Since $\frac{S_1}{S_0} = \frac{R}{S}$ and $S_n = 1$, the left hand side equals the Cauchy index $\operatorname{Ind}_a^b(\frac{R}{S})$. The right
hand side equals the Sturm index $\operatorname{Sturm}_a^b(\frac{R}{S})$ by definition. □

Remark 3.21. Sturm's theorem can be seen as an algebraic analogue of the fundamental
theorem of calculus (or Stokes' theorem): it reduces a 1-dimensional counting problem
on the interval $[a,b]$ to a 0-dimensional counting problem on the boundary $\{a,b\}$. We are
most interested in the former, but the latter has the advantage of being easily calculable.
Both become equal via the intermediate value property. In §4 we will generalize this to
the complex realm, reducing a 2-dimensional counting problem on a rectangle $\Gamma$ to a
1-dimensional counting problem on the boundary $\partial\Gamma$. This can be further generalized to
arbitrary dimension, leading to an algebraic version of Kronecker's index [15].

Remark 3.22. Sturm's theorem is usually stated under two additional hypotheses, namely
$\gcd(R,S) = 1$ and $S(a)S(b) \ne 0$. Our formulation of Theorem 3.20 does not require any
of these hypotheses; instead they are absorbed into our slightly refined definitions. The
hypothesis $\gcd(R,S) = 1$ is circumvented by formulating Definitions 3.6 and 3.19 such
that both indices become well-defined on $\mathbf{R}(X)$. The case $S(a)S(b) = 0$ is anticipated in
Definitions 3.1 and 3.4 by counting boundary points correctly. Arranging these details
is not only an aesthetic preoccupation: it clears the way for a uniform treatment of the
complex case in §4 and ensures a simpler algorithmic formulation.

As an immediate consequence we obtain Sturm's classical theorem [54, §2]: 

Corollary 3.23 (Sturm 1829/35). For every polynomial $P \in \mathbf{R}[X]^*$ we have

(3.13)  $\#\{ x \in [a,b] \mid P(x) = 0 \} = \operatorname{Ind}_a^b(\frac{P'}{P}) = \operatorname{Sturm}_a^b(\frac{P'}{P})$,

where roots on the boundary count for one half. □
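
Formula (3.13) translates almost literally into code. The following sketch works over the rationals with exact `Fraction` arithmetic; the function names and the coefficient-list representation (lowest degree first) are assumptions of this illustration. It also assumes that sign variations are counted as $\tfrac12 |\operatorname{sign}(s_{k-1}) - \operatorname{sign}(s_k)|$ per consecutive pair, which is how $V_a(P,Q) = \tfrac12$ arises in the proof of Theorem 3.11.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients; [] is the zero polynomial."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def evaluate(p, x):
    """Horner evaluation of p at x."""
    acc = Fraction(0)
    for c in reversed(p):
        acc = acc * x + c
    return acc

def poly_divmod(s, p):
    """Return (q, r) with s == p*q + r and deg r < deg p; p must be non-zero."""
    s = [Fraction(c) for c in trim(s)]
    p = [Fraction(c) for c in trim(p)]
    q = [Fraction(0)] * max(len(s) - len(p) + 1, 0)
    while len(s) >= len(p):
        shift = len(s) - len(p)
        factor = s[-1] / p[-1]
        q[shift] = factor
        for i, c in enumerate(p):
            s[shift + i] -= factor * c
        s = trim(s)
    return q, s

def sturm_chain(r, s):
    """Euclidean Sturm chain of r/s: P0 = s, P1 = r, P(k+1) = Qk*Pk - P(k-1), divided by Pn."""
    chain = [trim(s), trim(r)]
    while chain[-1]:
        _, rem = poly_divmod(chain[-2], chain[-1])
        chain.append([-c for c in rem])      # negated remainder, as in (3.10)
    chain.pop()                              # drop the trailing zero polynomial
    return [poly_divmod(p, chain[-1])[0] for p in chain]

def sign_variations(chain, x):
    """Sign variations at x, where a zero entry contributes one half per neighbour."""
    signs = [(v > 0) - (v < 0) for v in (evaluate(p, x) for p in chain)]
    return sum((Fraction(abs(u - v), 2) for u, v in zip(signs, signs[1:])), Fraction(0))

def sturm_index(r, s, a, b):
    chain = sturm_chain(r, s)
    return sign_variations(chain, a) - sign_variations(chain, b)

def count_roots(p, a, b):
    """Number of distinct roots of p in [a, b]; boundary roots count one half."""
    derivative = [k * c for k, c in enumerate(p)][1:]
    return sturm_index(derivative, p, a, b)


cubic = [-6, 11, -6, 1]          # (X - 1)(X - 2)(X - 3)
print(count_roots(cubic, 0, 4))  # 3
print(count_roots(cubic, 1, 3))  # 2, namely 1/2 + 1 + 1/2
```

Division by the last chain element $P_n$ is what makes the count insensitive to multiple roots: for $(X-1)^2(X+1)$ on $[-2,2]$ the same function returns 2.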

Remark 3.24. The intermediate value property is essential. Over the field $\mathbb{Q}$ of rational
numbers, for example, the function $f(x) = 2x/(x^2 - 2)$ has no poles, whence $\operatorname{Ind}_1^2(f) = 0$.
A Sturm chain is given by $S_0 = X^2 - 2$ and $S_1 = 2X$ and $S_2 = 2$, whence $V_1^2(S_0, S_1, S_2) = 1$.
Thus the Sturm index does not count roots resp. poles in $\mathbb{Q}$ but in the real closure $\mathbb{Q}^c$.
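
The value $V_1^2(S_0, S_1, S_2) = 1$ can be checked by evaluating the chain at both endpoints. The helper names below are hypothetical; the half-count convention for vanishing entries is assumed as in Definition 3.1, although no entry vanishes here.

```python
from fractions import Fraction

def sign(v):
    return (v > 0) - (v < 0)

def sign_variations(values):
    """Sum of |sign difference| / 2 over consecutive entries."""
    signs = [sign(v) for v in values]
    return sum((Fraction(abs(s - t), 2) for s, t in zip(signs, signs[1:])), Fraction(0))

def chain_at(x):
    """The chain (X^2 - 2, 2X, 2) evaluated at x."""
    return [x * x - 2, 2 * x, 2]

v1 = sign_variations(chain_at(1))   # signs (-, +, +): one variation
v2 = sign_variations(chain_at(2))   # signs (+, +, +): no variation
print(v1 - v2)                      # 1
```

The single sign variation lost between 1 and 2 corresponds to the pole at $\sqrt{2}$, which is not a rational number.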

Remark 3.25. By the usual bisection method, Formula (3.13) provides an algorithm to
locate all real roots of any given real polynomial. Once the roots are well separated, one
can switch to Newton's method (§7.3), which is simpler to apply and converges much faster
but vitally depends on good starting values.
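
The bisection method of Remark 3.25 can be sketched as follows, for polynomials with rational coefficients. This is one possible arrangement rather than the paper's algorithm (§7 is not reproduced here): all names are hypothetical, and the half contributions of boundary roots are subtracted so that each recursive call counts roots in the open interval only.

```python
from fractions import Fraction

def trim(p):
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def evaluate(p, x):
    acc = Fraction(0)
    for c in reversed(p):
        acc = acc * x + c
    return acc

def poly_divmod(s, p):
    """Return (q, r) with s == p*q + r and deg r < deg p; p must be non-zero."""
    s = [Fraction(c) for c in trim(s)]
    p = [Fraction(c) for c in trim(p)]
    q = [Fraction(0)] * max(len(s) - len(p) + 1, 0)
    while len(s) >= len(p):
        shift = len(s) - len(p)
        factor = s[-1] / p[-1]
        q[shift] = factor
        for i, c in enumerate(p):
            s[shift + i] -= factor * c
        s = trim(s)
    return q, s

def sturm_chain(r, s):
    """Euclidean Sturm chain of r/s, normalized by its last element."""
    chain = [trim(s), trim(r)]
    while chain[-1]:
        chain.append([-c for c in poly_divmod(chain[-2], chain[-1])[1]])
    chain.pop()
    return [poly_divmod(p, chain[-1])[0] for p in chain]

def sign_variations(chain, x):
    signs = [(v > 0) - (v < 0) for v in (evaluate(p, x) for p in chain)]
    return sum((Fraction(abs(u - v), 2) for u, v in zip(signs, signs[1:])), Fraction(0))

def isolate_roots(p, a, b, eps):
    """Sorted intervals (lo, hi) inside ]a, b[, each containing exactly one root of p.

    lo == hi marks an exact rational root; otherwise hi - lo < eps.
    """
    derivative = [k * c for k, c in enumerate(p)][1:]
    chain = sturm_chain(derivative, p)      # chain[0] has the same roots as p

    def open_count(lo, hi):
        # Sturm count on [lo, hi] minus the half contributions of boundary roots.
        total = sign_variations(chain, lo) - sign_variations(chain, hi)
        boundary = sum(1 for x in (lo, hi) if evaluate(chain[0], x) == 0)
        return total - Fraction(boundary, 2)

    def search(lo, hi):
        n = open_count(lo, hi)
        if n == 0:
            return []
        if n == 1 and hi - lo < eps:
            return [(lo, hi)]
        mid = (lo + hi) / 2
        exact = [(mid, mid)] if evaluate(chain[0], mid) == 0 else []
        return search(lo, mid) + exact + search(mid, hi)

    return search(Fraction(a), Fraction(b))


(lo, hi), = isolate_roots([-2, 0, 1], 0, 2, Fraction(1, 100))
print(float(lo), float(hi))   # an interval of width below 0.01 around sqrt(2)
```

Each returned interval could then serve as a starting region for a faster local method such as Newton's.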

4. Cauchy's theorem for complex polynomials 

In this section we define the algebraic index $\operatorname{ind}(F|\partial\Gamma)$ of a polynomial $F \in \mathbf{C}[X]$ along
the boundary of a rectangle $\Gamma \subset \mathbf{C}$. We then establish Cauchy's theorem (Corollary 4.12)
stating that $\operatorname{ind}(F|\partial\Gamma)$ counts the number of roots of $F$ in $\Gamma$.

Remark 4.1. Nowadays the index is most often defined via Cauchy's integral formula
$\operatorname{ind}(F|\partial\Gamma) = \frac{1}{2\pi i} \int_{\partial\Gamma} \frac{F'(z)}{F(z)}\,dz$. In 1837 Cauchy [9] published the alternative, more
elementary formulation presented below. Here we develop an independent, entirely algebraic
proof. The real product formula, Theorem 4.8, seems to be new. The complex product
formula, Corollaries 4.10 and 4.12, is well-known in the analytic setting using Cauchy's
integral, but the algebraic approach reveals two noteworthy extensions:

• The algebraic construction is not restricted to the complex numbers $\mathbb{C} = \mathbb{R}[i]$ but
works for $\mathbf{C} = \mathbf{R}[i]$ over an arbitrary real closed field $\mathbf{R}$.

• Unlike the integral formula, the algebraic index can cope with roots of $F$ on the
boundary $\partial\Gamma$, as pointed out in the introduction.

4.1. Real and complex variables. Just as we identify $(x,y) \in \mathbf{R}^2$ with $z = x + iy \in \mathbf{C}$, we
consider $\mathbf{C}[Z]$ as a subring of $\mathbf{C}[X,Y]$ with $Z = X + iY$. The conjugation on $\mathbf{C}$ extends to a
ring automorphism of $\mathbf{C}[X,Y]$ fixing $\overline{X} = X$ and $\overline{Y} = Y$, so that the conjugate of $Z = X + iY$
is $\overline{Z} = X - iY$. In this sense $X$ and $Y$ are real variables, whereas $Z$ is a complex variable.

Each $F \in \mathbf{C}[X,Y]$ can be uniquely decomposed as $F = R + iS$ with $R,S \in \mathbf{R}[X,Y]$,
namely $R = \operatorname{re} F := \frac12 (F + \overline{F})$ and $S = \operatorname{im} F := \frac{1}{2i} (F - \overline{F})$. In particular we thus recover
the familiar formulae $X = \operatorname{re} Z$ and $Y = \operatorname{im} Z$.

For $P \in \mathbf{C}[X,Y]$ we set $F(P) := F(\operatorname{re} P, \operatorname{im} P)$. The map $\mathbf{C}[X,Y] \to \mathbf{C}[X,Y]$, $F \mapsto F(P)$,
is the unique ring endomorphism that maps $Z \mapsto P$ and is equivariant with respect to
conjugation, because $Z \mapsto P$ and $\overline{Z} \mapsto \overline{P}$ are equivalent to $X \mapsto \operatorname{re} P$ and $Y \mapsto \operatorname{im} P$.

4.2. The complex index. We wish to define the index of polynomial paths in $\mathbf{C}$. Given
$a,b \in \mathbf{C}$, the map $\ell\colon [0,1] \to \mathbf{C}$ defined by $\ell(x) = a + x(b - a)$ joins $\ell(0) = a$ and $\ell(1) = b$
by a straight line segment. Its image will be denoted by $[a,b] = \{ a + x(b - a) \mid 0 \le x \le 1 \}$.

Definition 4.2. For $F \in \mathbf{C}[X,Y]$ and $a,b \in \mathbf{C}$ we define

$\operatorname{ind}(F|[a,b]) := \frac12 \operatorname{Ind}_0^1\bigl(\frac{\operatorname{re} \tilde F}{\operatorname{im} \tilde F}\bigr)$ where $\tilde F := F((b - a)X + a)$.

The map $\gamma\colon [0,1] \to \mathbf{C}$, $\gamma(x) = \tilde F(x)$, describes a path in $\mathbf{C}$. Assume for simplicity that
$\gamma(x) \ne 0$ for all $x \in [0,1]$. Then $\operatorname{ind}(F|[a,b])$ counts the number of turns that $\gamma$ performs
around 0: the index changes by $+\frac12$ each time $\gamma$ crosses the real axis in counter-clockwise
direction, and by $-\frac12$ if the passage is clockwise. We orient the line segment $[a,b]$ from $a$
to $b$; for the reverse orientation we obtain $\operatorname{ind}(F|[b,a]) = -\operatorname{ind}(F|[a,b])$.

4.3. Rectangles. A rectangle is a subset $\Gamma = [x_0,x_1] \times [y_0,y_1]$ in $\mathbf{C} = \mathbf{R}^2$ with $x_0 < x_1$ and
$y_0 < y_1$ in $\mathbf{R}$. Its interior is $\operatorname{Int} \Gamma = ]x_0,x_1[ \times ]y_0,y_1[$. Its boundary $\partial\Gamma$ consists of the four
vertices $a = (x_0,y_0)$, $b = (x_1,y_0)$, $c = (x_1,y_1)$, $d = (x_0,y_1)$, and the four edges $[a,b]$, $[b,c]$,
$[c,d]$, $[d,a]$ between them (see Figure 1).

Definition 4.3. Given a polynomial $F \in \mathbf{C}[X,Y]$ and a rectangle $\Gamma \subset \mathbf{C}$, we define the
algebraic index as

$\operatorname{ind}(F|\partial\Gamma) := \operatorname{ind}(F|[a,b]) + \operatorname{ind}(F|[b,c]) + \operatorname{ind}(F|[c,d]) + \operatorname{ind}(F|[d,a])$.

As $z$ travels from $a$ to $b$ to $c$ to $d$ and back to $a$ along the boundary, $F(z)$ describes
a closed, continuous, piecewise polynomial curve, as illustrated in Figure 1. The index
$\operatorname{ind}(F|\partial\Gamma)$ counts the number of turns around 0, usually called the winding number.

Our algebraic definition is slightly more comprehensive than the geometric one since it
does not exclude roots on the boundary.
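
Definitions 4.2 and 4.3 can be turned into a short exact computation for polynomials $F \in \mathbf{C}[Z]$ with Gaussian rational coefficients. The sketch below is an illustration under several assumptions: all function names are hypothetical, complex numbers are stored as `(re, im)` pairs, coefficient lists run from lowest to highest degree, each Cauchy index is evaluated through the Sturm index of Theorem 3.20, and sign variations use the half-count convention for vanishing entries.

```python
from fractions import Fraction

def trim(p):
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def evaluate(p, x):
    acc = Fraction(0)
    for c in reversed(p):
        acc = acc * x + c
    return acc

def poly_divmod(s, p):
    """Return (q, r) with s == p*q + r and deg r < deg p; p must be non-zero."""
    s = [Fraction(c) for c in trim(s)]
    p = [Fraction(c) for c in trim(p)]
    q = [Fraction(0)] * max(len(s) - len(p) + 1, 0)
    while len(s) >= len(p):
        shift = len(s) - len(p)
        factor = s[-1] / p[-1]
        q[shift] = factor
        for i, c in enumerate(p):
            s[shift + i] -= factor * c
        s = trim(s)
    return q, s

def sturm_index(r, s, a, b):
    """V_a - V_b for the euclidean Sturm chain of r/s (r, s not both zero)."""
    chain = [trim(s), trim(r)]
    while chain[-1]:
        chain.append([-c for c in poly_divmod(chain[-2], chain[-1])[1]])
    chain.pop()
    chain = [poly_divmod(p, chain[-1])[0] for p in chain]
    def variations(x):
        signs = [(v > 0) - (v < 0) for v in (evaluate(p, x) for p in chain)]
        return sum((Fraction(abs(u - v), 2) for u, v in zip(signs, signs[1:])), Fraction(0))
    return variations(a) - variations(b)

def cmul(u, v):
    """Product of two complex numbers stored as (re, im) pairs."""
    return (u[0] * v[0] - u[1] * v[1], u[0] * v[1] + u[1] * v[0])

def restrict(f, a, b):
    """Coefficients of F(a + X*(b - a)) in X, as (re, im) pairs (Horner scheme)."""
    d = (b[0] - a[0], b[1] - a[1])
    out = []
    for c in reversed(f):
        new = [(0, 0)] * (len(out) + 1)
        for i, u in enumerate(out):
            ua, ud = cmul(u, a), cmul(u, d)
            new[i] = (new[i][0] + ua[0], new[i][1] + ua[1])
            new[i + 1] = (new[i + 1][0] + ud[0], new[i + 1][1] + ud[1])
        new[0] = (new[0][0] + c[0], new[0][1] + c[1])
        out = new
    return out

def edge_index(f, a, b):
    """ind(F | [a, b]) = half the Cauchy index of re/im along the segment."""
    g = restrict(f, a, b)
    return sturm_index([u[0] for u in g], [u[1] for u in g], 0, 1) / 2

def rectangle_index(f, x0, x1, y0, y1):
    a, b, c, d = (x0, y0), (x1, y0), (x1, y1), (x0, y1)
    return sum(edge_index(f, p, q) for p, q in [(a, b), (b, c), (c, d), (d, a)])


Z = [(0, 0), (1, 0)]                     # F = Z, with its root on the vertex a = 0
print(rectangle_index(Z, 0, 1, 0, 1))    # 1/4
```

For $F = Z^2 + 1$, encoded as `[(1, 0), (0, 0), (1, 0)]`, the same function gives 2 on $[-2,2] \times [-2,2]$, in line with the two roots $\pm i$ in the interior.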

Proposition 4.4 (bisection). Suppose that we bisect $\Gamma = [x_0,x_2] \times [y_0,y_2]$

• horizontally into $\Gamma' = [x_0,x_1] \times [y_0,y_2]$ and $\Gamma'' = [x_1,x_2] \times [y_0,y_2]$,

• or vertically into $\Gamma' = [x_0,x_2] \times [y_0,y_1]$ and $\Gamma'' = [x_0,x_2] \times [y_1,y_2]$

where $x_0 < x_1 < x_2$ and $y_0 < y_1 < y_2$. Then $\operatorname{ind}(F|\partial\Gamma) = \operatorname{ind}(F|\partial\Gamma') + \operatorname{ind}(F|\partial\Gamma'')$.

Proof. This follows from Definition 4.3 using one-dimensional bisection as in Proposition
3.8(a) and (b) internal cancellation according to $\operatorname{ind}(F|[b,a]) = -\operatorname{ind}(F|[a,b])$. □

Proposition 4.5. For a linear polynomial $F = Z - z_0$ with $z_0 \in \mathbf{C}$ we find

$\operatorname{ind}(F|\partial\Gamma) = \begin{cases} 1 & \text{if } z_0 \text{ is in the interior of } \Gamma, \\ \tfrac12 & \text{if } z_0 \text{ is in one of the edges of } \Gamma, \\ \tfrac14 & \text{if } z_0 \text{ is in one of the vertices of } \Gamma, \\ 0 & \text{if } z_0 \text{ is in the exterior of } \Gamma. \end{cases}$

Proof. By bisection, all configurations can be reduced to the case where $z_0$ is a vertex of $\Gamma$.
By symmetry, translation, and homothety we can assume that $z_0 = a = 0$, $b = 1$, $c = 1 + i$,
$d = i$. Here an easy explicit calculation shows that

$\operatorname{ind}(F|[a,b]) = \operatorname{ind}(X|[0,1]) = \frac12 \operatorname{Ind}_0^1(\frac{X}{0}) = 0$,
$\operatorname{ind}(F|[b,c]) = \operatorname{ind}(1 + iX|[0,1]) = \frac12 \operatorname{Ind}_0^1(\frac{1}{X}) = \frac14$,
$\operatorname{ind}(F|[c,d]) = \operatorname{ind}(1 + i - X|[0,1]) = \frac12 \operatorname{Ind}_0^1(\frac{1-X}{1}) = 0$,
$\operatorname{ind}(F|[d,a]) = \operatorname{ind}(i - iX|[0,1]) = \frac12 \operatorname{Ind}_0^1(\frac{0}{1-X}) = 0$,

whence $\operatorname{ind}(F|\partial\Gamma) = \frac14$. □

4.4. Piecewise polynomial loops. In this article we concentrate on rectangles because
they support a simple bisection algorithm and are best suited to homotopy arguments.
More generally, we can consider piecewise polynomial curves, or piecewise rational curves
without poles, and essentially all statements carry over verbatim. More explicitly:

Definition 4.6. Consider an interval $[a,b] \subset \mathbf{R}$, subdivided by $a = t_0 < t_1 < \dots < t_n = b$,
and let $G_1, \dots, G_n \in \mathbf{C}[X]$ be polynomials that satisfy $G_k(t_k) = G_{k+1}(t_k)$ for $k = 1, \dots, n-1$.
This defines a continuous, piecewise polynomial curve $\gamma\colon [a,b] \to \mathbf{C}$ by $\gamma(t) := G_k(t)$ for
$t \in [t_{k-1}, t_k]$. Its index is defined by $\operatorname{ind}(\gamma) := \sum_{k=1}^{n} \operatorname{ind}(G_k|[t_{k-1}, t_k])$.

Remark 4.7. For closed polynomial paths $\gamma\colon [0,1] \to \mathbf{C}^*$ Definition 4.6 provides the
explicit formulation of the definition/computation (I0) stated in Theorem 1.2. Proposition
4.5 establishes the normalization (I1) for rectangles, which confirms that $\operatorname{ind}(\gamma)$ counts the
number of turns around 0 in this basic case. The remaining properties, multiplicativity (I2)
and homotopy invariance (I3), will be proven below.

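4.5. The product formula. The product of two polynomials $F = P + iQ$ and $G = R + iS$
with $P,Q,R,S \in \mathbf{R}[X]$ is given by $FG = (PR - QS) + i(PS + QR)$. The following result
relates the Cauchy indices of $\frac{P}{Q}$ and $\frac{R}{S}$ to that of $\frac{PR - QS}{PS + QR}$.

Theorem 4.8 (real product formula). For all $P,Q,R,S \in \mathbf{R}[X]^*$ we have

(4.1)  $\operatorname{Ind}_a^b\bigl(\frac{PR - QS}{PS + QR}\bigr) = \operatorname{Ind}_a^b\bigl(\frac{P}{Q}\bigr) + \operatorname{Ind}_a^b\bigl(\frac{R}{S}\bigr) - V_a^b\bigl(1, \frac{P}{Q} + \frac{R}{S}\bigr)$.

Remark 4.9. The fraction $\frac{P}{Q} + \frac{R}{S} = \frac{PS + QR}{QS} = \frac{\operatorname{im}(FG)}{\operatorname{im}(F)\operatorname{im}(G)}$ can be evaluated for all $x \in \mathbf{R}$
with the convention $\operatorname{sign}(\infty) = 0$. The correction term $V_a^b$ is a difference in the boundary
points $a$ and $b$, which will be important in the corollaries. Theorem 4.8 extends to the
degenerate cases: If $(P \ne 0, Q = 0)$ or $(R \ne 0, S = 0)$, then Formula (4.1) trivially holds. If
$(P = 0, Q \ne 0)$ or $(R = 0, S \ne 0)$, then we obtain the inversion formula of Theorem 3.11.
This also happens for $(P = S, Q = R)$, which corresponds to $F = i\overline{G}$.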

Proof. We can assume that $\gcd(P,Q) = \gcd(R,S) = 1$. Suppose first that $[a,b]$ does not
contain any poles, that is, roots of $Q$, $S$, $PS + QR$. On the one hand, all three indices vanish
in the absence of poles. On the other hand, the intermediate value property ensures that $Q$,
$S$, and $PS + QR$ are of constant sign on $[a,b]$, whence $V_a^b(1, \frac{P}{Q} + \frac{R}{S}) = 0$.

Suppose next that $[a,b]$ contains at least one pole. Formula (4.1) is additive with respect
to bisection of the interval $[a,b]$. We can thus assume that $[a,b]$ contains only one pole.
Bisecting once more, if necessary, we can assume that this pole is either $a$ or $b$. Applying
the symmetry $X \mapsto a + b - X$, if necessary, we can assume that the pole is $a$. We thus
have $V_a^b = \frac12 \operatorname{sign}(\frac{P}{Q} + \frac{R}{S} \mid X = b)$ and $Q$, $S$, $PS + QR$ are non-zero and of constant sign on $]a,b]$.
Applying the symmetry $(P,Q,R,S) \mapsto (P,-Q,R,-S)$, if necessary, we can assume that
$V_a^b = +\frac12$, which means that $\frac{P}{Q} + \frac{R}{S} > 0$ on $]a,b]$. We distinguish three cases:

First case. Suppose first that either $Q(a) = 0$ or $S(a) = 0$. Applying the symmetry
$(P,Q,R,S) \mapsto (R,S,P,Q)$, if necessary, we can assume that $Q(a) = 0$ and $S(a) \ne 0$. Then
$PS + QR$ does not vanish in $a$, whence $\operatorname{Ind}_a^b(\frac{PR - QS}{PS + QR}) = \operatorname{Ind}_a^b(\frac{R}{S}) = 0$. We have $\lim_a^+ \frac{P}{Q} =
\lim_a^+(\frac{P}{Q} + \frac{R}{S}) = +\infty$, whence $\operatorname{Ind}_a^b(\frac{P}{Q}) = +\frac12$ and Formula (4.1) holds.

Second case. Suppose that $PS + QR$ vanishes in $a$, but $Q(a) \ne 0$ and $S(a) \ne 0$. Then
$\operatorname{Ind}_a^b(\frac{P}{Q}) = \operatorname{Ind}_a^b(\frac{R}{S}) = 0$, and we only have to study the pole of

(4.2)  $\frac{PR - QS}{PS + QR} = \frac{\frac{P}{Q} \cdot \frac{R}{S} - 1}{\frac{P}{Q} + \frac{R}{S}}$.

In $a$ the denominator vanishes and the numerator is negative:

$\frac{P(a)}{Q(a)} + \frac{R(a)}{S(a)} = 0$, whence $\frac{P(a)}{Q(a)} \cdot \frac{R(a)}{S(a)} - 1 = -\frac{P^2(a)}{Q^2(a)} - 1 < 0$.

This implies $\lim_a^+ \frac{PR - QS}{PS + QR} = -\infty$, whence $\operatorname{Ind}_a^b(\frac{PR - QS}{PS + QR}) = -\frac12$ and Formula (4.1) holds.

Third case. Suppose that $a$ is a common pole of $\frac{P}{Q}$ and $\frac{R}{S}$, whence also of $\frac{PR - QS}{PS + QR}$. Since
$\frac{P}{Q} + \frac{R}{S} > 0$ on $]a,b]$, we have $\lim_a^+ \frac{P}{Q} = +\infty$ or $\lim_a^+ \frac{R}{S} = +\infty$. Equation (4.2) implies that
$\lim_a^+(\frac{PR - QS}{PS + QR}) = \lim_a^+(\frac{P}{Q}) \cdot \lim_a^+(\frac{R}{S})$. In each case Formula (4.1) holds. □

Corollary 4.10 (complex product formula). If $F,G \in \mathbf{C}[X,Y]$ do not vanish in any of the
vertices of the rectangle $\Gamma \subset \mathbf{R}^2$, then $\operatorname{ind}(F \cdot G|\partial\Gamma) = \operatorname{ind}(F|\partial\Gamma) + \operatorname{ind}(G|\partial\Gamma)$.

Proof. This follows from the real product formula of Theorem 4.8 and the fact that the
boundary $\partial\Gamma$ forms a closed path. By excluding roots on the vertices we ensure that at
each vertex both boundary contributions cancel each other. □

Remark 4.11. The same argument applies to the product of any two piecewise polynomial
loops $\gamma_1, \gamma_2\colon [0,1] \to \mathbf{C}$, provided that vertices are not mapped to 0. For $\gamma_1, \gamma_2\colon [0,1] \to \mathbf{C}^*$
this proves the multiplicativity (I2) stated in Theorem 1.2: $\operatorname{ind}(\gamma_1 \cdot \gamma_2) = \operatorname{ind}(\gamma_1) + \operatorname{ind}(\gamma_2)$.

Corollary 4.12 (root counting). Consider a polynomial $F \in \mathbf{C}[Z]^*$ that splits into linear
factors, such that $F = c(Z - z_1) \cdots (Z - z_n)$ for some $c, z_1, \dots, z_n \in \mathbf{C}$. If none of the roots
lies on a vertex of $\Gamma$, then $\operatorname{ind}(F|\partial\Gamma)$ counts the number of roots in $\Gamma$ as in Theorem 1.9. □

Remark 4.13. In the preceding corollaries we explicitly exclude roots on the vertices.
While the degree 1 case of Proposition 4.5 is easy (and useful in the proof) there is
no such simple rule in degree $\ge 2$. As an illustration consider $\Gamma = [0,1] \times [0,1]$ and
$F_t = Z \cdot (Z - 2 - it)$: after a little calculation we find $\operatorname{ind}(F_1|\partial\Gamma) = 0$ and $\operatorname{ind}(F_0|\partial\Gamma) = \frac14$
and $\operatorname{ind}(F_{-1}|\partial\Gamma) = \frac12$. This shows that in this degenerate case the index depends on the
configuration of all roots and not only on the number of roots in $\Gamma$. We will not further
pursue this question, which is only of marginal interest, and simply exclude roots on the
vertices. We emphasize once again that roots on the edges pose no problem.

Remark 4.14. If we assume that C is algebraically closed, then every polynomial F e C[Z] 
factors as required in Corollary 4.12. So if you prefer some other existence proof for the 
roots, then you may skip the next section and still benefit from root location (Theorem 
1.12). This seems to be the point of view adopted by Cauchy [8, 9] in 1831/37, which may 
explain why he did not attempt to use his index for a constructive proof of the Fundamental 
Theorem of Algebra. (In 1820 he had already given a non-constructive proof, see §8.8.1.) 
In 1836 Sturm and Liouville [57, 55] proposed to extend Cauchy's algebraic method for
root counting so as to obtain an existence proof. This is our aim in the next section. 

5. The Fundamental Theorem of Algebra 

We continue to consider a real closed field $\mathbf{R}$ and its complex extension $\mathbf{C} = \mathbf{R}[i]$ where
$i^2 = -1$. In the preceding sections we have constructed the algebraic index $\operatorname{ind}(F|\partial\Gamma)$ for
$F \in \mathbf{C}[Z]^*$ and $\Gamma \subset \mathbf{C}$, and derived its multiplicativity. We can now establish our main
result: an effective, real-algebraic proof of the Fundamental Theorem of Algebra.

Remark 5.1. The proof that we present here is inspired by classical arguments, based on the
winding number of loops in the complex plane. The idea goes back to Gauss' dissertation
(see §8.2) and has been much elaborated since. For $\mathbf{C} = \mathbf{R}[i]$ over a real closed field $\mathbf{R}$, our
algebraic approach strips the proof to its bare essentials. In this setting the algebraic proof
of Theorem 5.3 seems to be new.

5.1. The index in the absence of zeros. The crucial step is to show that $\operatorname{ind}(F|\partial\Gamma) \ne 0$
implies that $F$ has a root in $\Gamma$. By contraposition, we will show that $\operatorname{ind}(F|\partial\Gamma) = 0$
whenever $F$ has no zeros in $\Gamma$. The local version is easy:

Lemma 5.2 (local version). If $F \in \mathbf{C}[X,Y]$ satisfies $F(x,y) \ne 0$ for some point $(x,y) \in \mathbf{R}^2$,
then there exists $\delta > 0$ such that $\operatorname{ind}(F|\partial\Gamma) = 0$ for every $\Gamma \subset [x - \delta, x + \delta] \times [y - \delta, y + \delta]$.

Proof. Let us make the standard continuity argument explicit. For all $s,t \in \mathbf{R}$ we have
$F(x + s, y + t) = a + \sum_{j+k \ge 1} a_{jk} s^j t^k$ with $a = F(x,y) \ne 0$ and certain coefficients $a_{jk} \in \mathbf{C}$.
We set $M := \max \sqrt[j+k]{|a_{jk}/a|}$, so that $|a_{jk}| \le |a| \cdot M^{j+k}$. For $\delta := \frac{1}{4M}$ and $|s|,|t| \le \delta$ we find

(5.1)  $\bigl| \sum_{j+k \ge 1} a_{jk} s^j t^k \bigr| \le \sum_{n \ge 1} \sum_{j+k=n} |a| \cdot M^{j+k} \cdot |s|^j \cdot |t|^k \le |a| \sum_{n \ge 1} (n+1) (\tfrac14)^n = \tfrac79 |a|$.

This allows us to define the desired neighbourhood $U := [x - \delta, x + \delta] \times [y - \delta, y + \delta]$.
The Estimate (5.1) shows that $F$ does not vanish on $U$. Corollary 4.10 ensures that
$\operatorname{ind}(F|\partial\Gamma) = \operatorname{ind}(cF|\partial\Gamma)$ for every rectangle $\Gamma \subset U$ and every constant $c \in \mathbf{C}^*$. Choosing
$c = i/a$ we can assume that $F(x,y) = i$. The Estimate (5.1) then shows that $\operatorname{im} F > 0$
on $U$, whence $\operatorname{ind}(F|\partial\Gamma) = 0$ for every rectangle $\Gamma \subset U$. □
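
The numerical constant behind this estimate is elementary: for $|s|,|t| \le \frac{1}{4M}$ each of the $n+1$ monomials of total degree $n$ is bounded by $|a| \cdot 4^{-n}$, and $\sum_{n \ge 1} (n+1) 4^{-n} = \frac{1}{(1 - 1/4)^2} - 1 = \frac79 < 1$. The following lines check this with exact rational arithmetic; the choice $\delta = \frac{1}{4M}$ is the assumption under which the bound is computed, and `partial_sum` is a hypothetical helper name.

```python
from fractions import Fraction

def partial_sum(terms):
    """Sum of (n + 1) / 4**n for n = 1, ..., terms."""
    return sum((Fraction(n + 1, 4 ** n) for n in range(1, terms + 1)), Fraction(0))

limit = Fraction(7, 9)
print(partial_sum(1), partial_sum(2))        # 1/2 11/16
print(float(limit - partial_sum(30)))        # a tiny positive gap
```

Since a polynomial has only finitely many coefficients, only a partial sum is actually needed, and every partial sum stays strictly below $\frac79$.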

While the preceding local lemma uses only continuity and holds over every ordered
field, the following global version requires the field $\mathbf{R}$ to be real closed.

Theorem 5.3 (global version). Let $\Gamma = [x_0,x_1] \times [y_0,y_1]$ be a rectangle in $\mathbf{C}$. If $F \in \mathbf{C}[X,Y]$
satisfies $F(x,y) \ne 0$ for all $(x,y) \in \Gamma$, then $\operatorname{ind}(F|\partial\Gamma) = 0$.

We remark that over the real numbers $\mathbb{R}$, a short proof can be given as follows:

Compactness proof. The rectangle $\Gamma = [x_0,x_1] \times [y_0,y_1]$ is covered by the family of open
sets $U(x,y) = ]x - \delta, x + \delta[ \times ]y - \delta, y + \delta[$ of Lemma 5.2, where $\delta$ depends on $(x,y)$.
Compactness of $\Gamma$ ensures that there exists $\lambda > 0$, called a Lebesgue number of the cover,
such that every rectangle $\Gamma' \subset \Gamma$ of diameter $< \lambda$ is contained in some $U(x,y)$. For all
subdivisions $x_0 = s_0 < s_1 < \dots < s_m = x_1$ and $y_0 = t_0 < t_1 < \dots < t_n = y_1$, the bisection property
ensures that $\operatorname{ind}(F|\partial\Gamma) = \sum_{j=1}^{m} \sum_{k=1}^{n} \operatorname{ind}(F|\partial\Gamma_{jk})$ where $\Gamma_{jk} = [s_{j-1}, s_j] \times [t_{k-1}, t_k]$. For
$s_j = x_0 + j \frac{x_1 - x_0}{m}$ and $t_k = y_0 + k \frac{y_1 - y_0}{n}$ with $m,n$ sufficiently large, each $\Gamma_{jk}$ has diameter
$< \lambda$, so Lemma 5.2 implies that $\operatorname{ind}(F|\partial\Gamma_{jk}) = 0$ for all $j,k$, whence $\operatorname{ind}(F|\partial\Gamma) = 0$. □

The preceding compactness argument applies only to the field $\mathbb{C} = \mathbb{R}[i]$ of complex
numbers over $\mathbb{R}$ (§2.1) and not to an arbitrary real closed field (§2.2). In particular, it is
no longer elementary in the sense that it uses a second-order property (§2.3). We therefore
provide an elementary real-algebraic proof using Sturm chains:

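Algebraic proof. Each $F \in \mathbf{C}[X,Y]$ can be written as $F = \sum_{k=0}^{n} f_k X^k$ with $f_k \in \mathbf{C}[Y]$. In
this way we consider $\mathbf{R}[X,Y] = \mathbf{R}[Y][X]$ as a polynomial ring in one variable $X$ over $\mathbf{R}[Y]$.
Starting with $S_0, S_1 \in \mathbf{R}[X,Y]$ such that $\frac{S_1}{S_0} = \frac{\operatorname{re} F}{\operatorname{im} F}$, pseudo-euclidean division in $\mathbf{R}[Y][X]$,
as explained in Remark 3.18, produces a chain $(S_0, \dots, S_n)$ such that $S_{k+1} = Q_k S_k - c_k^2 S_{k-1}$
for some $Q_k \in \mathbf{R}[Y][X]$ and $c_k \in \mathbf{R}[Y]^*$. We have $\deg_X S_{k+1} < \deg_X S_k$, so we end up with
$S_{n+1} = 0$ and $S_n \in \mathbf{R}[Y]^*$ for some $n$. (If $\deg_X S_n > 0$, then $\gcd(S_0, S_1)$ in $\mathbf{R}(Y)[X]$ is of
positive degree and we can reduce the initial fraction $\frac{S_1}{S_0}$.)

Regular case. Assume first that $S_n$ does not vanish in $[y_0,y_1]$. Proposition 3.14 ensures
that specializing $(S_0, \dots, S_n)$ in $Y \mapsto y \in [y_0,y_1]$ yields a Sturm chain in $\mathbf{R}[X]$, and likewise
specializing $(S_0, \dots, S_n)$ in $X \mapsto x \in [x_0,x_1]$ yields a Sturm chain in $\mathbf{R}[Y]$. In the sum over
all four edges of $\Gamma$, all contributions cancel each other in pairs:

$2 \operatorname{ind}(F|\partial\Gamma) = \operatorname{Ind}_{x_0}^{x_1}\bigl(\frac{\operatorname{re} F}{\operatorname{im} F} \mid Y = y_0\bigr) + \operatorname{Ind}_{y_0}^{y_1}\bigl(\frac{\operatorname{re} F}{\operatorname{im} F} \mid X = x_1\bigr)$
$\qquad + \operatorname{Ind}_{x_1}^{x_0}\bigl(\frac{\operatorname{re} F}{\operatorname{im} F} \mid Y = y_1\bigr) + \operatorname{Ind}_{y_1}^{y_0}\bigl(\frac{\operatorname{re} F}{\operatorname{im} F} \mid X = x_0\bigr)$
$\quad = V_{x_0}^{x_1}(S_0, \dots, S_n \mid Y = y_0) + V_{y_0}^{y_1}(S_0, \dots, S_n \mid X = x_1)$
$\qquad + V_{x_1}^{x_0}(S_0, \dots, S_n \mid Y = y_1) + V_{y_1}^{y_0}(S_0, \dots, S_n \mid X = x_0) = 0$.

Singular case. In general we have to cope with a finite set $\mathcal{Y} \subset [y_0,y_1]$ of roots of $S_n$.
We can change the roles of $X$ and $Y$ and apply the euclidean algorithm in $\mathbf{R}[X][Y]$; this
leads to a finite set of roots $\mathcal{X} \subset [x_0,x_1]$. We obtain a finite set $\mathcal{Z} = \mathcal{X} \times \mathcal{Y}$ of singular
points in $\Gamma$, where both chains fail. (These points are potential zeros of $F$.)

Figure 2. Isolating a singular point $(x_0,y_0)$ within $\Gamma = [x_0,x_1] \times [y_0,y_1]$

By subdivision and symmetry we can assume that $(x_0,y_0)$ is the only singular point in
our rectangle $\Gamma = [x_0,x_1] \times [y_0,y_1]$. By hypothesis $F$ does not vanish in $(x_0,y_0)$, so we
can apply Lemma 5.2 to $\Gamma_1 = [x_0, x_0 + \delta] \times [y_0, y_0 + \delta]$ with $\delta > 0$ sufficiently small such
that $\operatorname{ind}(F|\partial\Gamma_1) = 0$. The remaining three rectangles $\Gamma_2 = [x_0, x_0 + \delta] \times [y_0 + \delta, y_1]$ and
$\Gamma_3 = [x_0 + \delta, x_1] \times [y_0, y_0 + \delta]$ and $\Gamma_4 = [x_0 + \delta, x_1] \times [y_0 + \delta, y_1]$ do not contain any singular
points, such that $\operatorname{ind}(F|\partial\Gamma_j) = 0$ by appealing to the regular case.

Summing over all sub-rectangles we conclude that $\operatorname{ind}(F|\partial\Gamma) = 0$. □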

5.2. Counting complex roots. The following result generalizes the real root count (§3.3)
to complex roots.

Theorem 5.4. Consider a polynomial $F \in \mathbf{C}[Z]^*$ and a rectangle $\Gamma \subset \mathbf{C}$ such that $F$ does
not vanish in the vertices of $\Gamma$. Then $\operatorname{ind}(F|\partial\Gamma)$ counts the number of roots of $F$ in $\Gamma$. Roots
on the boundary count for one half.

Proof. We can factor $F = (Z - z_1) \cdots (Z - z_m) G$ such that $G \in \mathbf{C}[Z]^*$ has no roots in $\mathbf{C}$. The
assertion follows from the product formula of Corollary 4.10. Each linear factor $(Z - z_k)$
contributes to the index as stated in Proposition 4.5. The factor $G$ does not contribute to the
index according to Theorem 5.3. (We will prove below that $m = \deg F$ and $G \in \mathbf{C}^*$.) □

We have focused on polynomials $F \in \mathbf{C}[Z]$, but Definition 4.3 of the index and the
product formula of Corollary 4.10 immediately extend to rational functions $F \in \mathbf{C}(Z)$. It
is then an easy matter to establish the following generalization:

Corollary 5.5. Consider a rational function $F \in \mathbf{C}(Z)$ and a rectangle $\Gamma \subset \mathbf{C}$ such that
the vertices of $\Gamma$ are neither roots nor poles of $F$. Then $\operatorname{ind}(F|\partial\Gamma)$ counts the number of
roots minus the number of poles of $F$ in $\Gamma$. Boundary points count for one half. □

5.3. Homotopy invariance. We wish to show that the Cauchy index $\operatorname{ind}(F_t|\partial\Gamma)$ does not
change if we deform $F_0$ to $F_1$. To make this precise we consider $F \in \mathbf{C}[T,Z]$ and denote by
$F_t$ the polynomial in $\mathbf{C}[Z]$ obtained by specializing $T \mapsto t \in [0,1]$.

Theorem 5.6. Suppose that $F \in \mathbf{C}[T,Z]$ is such that for each $t \in [0,1]$ the polynomial
$F_t \in \mathbf{C}[Z]$ has no roots on $\partial\Gamma$. Then $\operatorname{ind}(F_0|\partial\Gamma) = \operatorname{ind}(F_1|\partial\Gamma)$.

Proof. Over each edge $[a,b]$ of $\Gamma$ we consider the rectangle $\Gamma_1 = [0,1] \times [a,b]$. In the
absence of zeros, Theorem 5.3 ensures that $\operatorname{ind}(F|\partial\Gamma_1) = 0$, that is,

$\operatorname{ind}(F|\{0\} \times [a,b]) - \operatorname{ind}(F|\{1\} \times [a,b]) = \operatorname{ind}(F|[0,1] \times \{a\}) - \operatorname{ind}(F|[0,1] \times \{b\})$.

In the sum over all four edges of $\Gamma$ the terms on the right hand side cancel each other in
pairs. We conclude that $\operatorname{ind}(F_0|\partial\Gamma) - \operatorname{ind}(F_1|\partial\Gamma) = 0$. □

Remark 5.7. The proof holds for every piecewise polynomial homotopy $H\colon [0,1] \times [0,1] \to \mathbf{C}^*$
where $\gamma_t = H(t,-)\colon [0,1] \to \mathbf{C}^*$ is a closed path for each $t \in [0,1]$. This proves the
homotopy invariance (I3) stated in Theorem 1.2: $\operatorname{ind}(\gamma_0) = \operatorname{ind}(\gamma_1)$.

5.4. The global index of a polynomial. Having all tools in hand, we can now prove
Theorem 1.11, stating that $\operatorname{ind}(F|\partial\Gamma) = \deg F$ for every polynomial $F \in \mathbf{C}[Z]^*$ and every
sufficiently large rectangle $\Gamma$. This can be quantified by Cauchy's bound:

Definition 5.8. For $F = c_n Z^n + c_{n-1} Z^{n-1} + \dots + c_0$ in $\mathbf{C}[Z]$ with $c_n \ne 0$ we set $M =
\max\{0, |c_0|, \dots, |c_{n-1}|\}$ and define the Cauchy radius to be $\rho_F := 1 + M/|c_n|$.

Proposition 5.9. If $z \in \mathbf{C}$ satisfies $|z| \ge \rho_F$, then $|F(z)| \ge |c_n| > 0$. Hence all roots of $F$ in
$\mathbf{C}$ are contained in the Cauchy disk $B(\rho_F) = \{ z \in \mathbf{C} \mid |z| < \rho_F \}$.

Proof. The assertion is true for $F = c_n Z^n$ where $M = 0$ and $\rho_F = 1$. In the sequel we can
thus assume $M > 0$ and $\rho_F > 1$. For all $z \in \mathbf{C}$ satisfying $|z| \ge \rho_F$ we find

$|F(z) - c_n z^n| = |c_0 + c_1 z + \dots + c_{n-1} z^{n-1}| \le |c_0| + |c_1||z| + \dots + |c_{n-1}||z^{n-1}|$
$\qquad \le M + M|z| + \dots + M|z|^{n-1} = M \frac{|z|^n - 1}{|z| - 1} \le |c_n| (|z|^n - 1)$.

For the last inequality notice that $|z| \ge \rho_F$ implies $|z| - 1 \ge \rho_F - 1 = M/|c_n|$. We have

$|c_n z^n| = |c_n z^n - F(z) + F(z)| \le |c_n z^n - F(z)| + |F(z)|$,
whence $|F(z)| \ge |c_n z^n| - |F(z) - c_n z^n| \ge |c_n||z|^n - |c_n|(|z|^n - 1) = |c_n| > 0$. □
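
The Cauchy radius is straightforward to compute. The sketch below uses floating-point complex numbers for illustration only (the paper's setting is exact arithmetic over a real closed field); the names `cauchy_radius` and `check_bound` are hypothetical, and coefficients are listed from $c_0$ up to $c_n$.

```python
import cmath

def cauchy_radius(coeffs):
    """coeffs = [c_0, ..., c_n] with c_n != 0; returns rho_F = 1 + M/|c_n|."""
    *lower, lead = coeffs
    if lead == 0:
        raise ValueError("leading coefficient must be non-zero")
    m = max([0] + [abs(c) for c in lower])
    return 1 + m / abs(lead)

def evaluate(coeffs, z):
    """Horner evaluation of F at z."""
    acc = 0
    for c in reversed(coeffs):
        acc = acc * z + c
    return acc

def check_bound(coeffs, samples=360):
    """Sample the circle |z| = rho_F and test |F(z)| >= |c_n| there, up to rounding."""
    rho = cauchy_radius(coeffs)
    lead = abs(coeffs[-1])
    return all(
        abs(evaluate(coeffs, rho * cmath.exp(2j * cmath.pi * k / samples))) >= lead - 1e-9
        for k in range(samples)
    )


print(cauchy_radius([2, -3, 1]))   # 4.0 for Z^2 - 3Z + 2, whose roots are 1 and 2
print(check_bound([2, -3, 1]))     # True
```

Sampling a circle is of course no proof; it merely illustrates the inequality of Proposition 5.9 on the boundary of the Cauchy disk.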

This proposition is not an existence result but only an a priori bound: it says that if
$F$ has roots in $\mathbf{C}$, then each such root necessarily lies in $B(\rho_F)$. Now the complex index
allows us to count all roots of $F$ in $\mathbf{C}$ and to establish the desired conclusion:

Theorem 5.10. For every polynomial $F \in \mathbf{C}[Z]^*$ and every rectangle $\Gamma \subset \mathbf{C}$ containing the
Cauchy disk $B(\rho_F)$ we have $\operatorname{ind}(F|\partial\Gamma) = \deg F$.

Proof. Given a polynomial $F = c_n Z^n + c_{n-1} Z^{n-1} + \dots + c_0$ with $c_n \ne 0$ we deform $F_1 = F$
to $F_0 = c_n Z^n$ via $F_t = c_n Z^n + t (c_{n-1} Z^{n-1} + \dots + c_0)$. For each $t \in [0,1]$ the Cauchy radius of
$F_t$ is $\rho_t = 1 + tM/|c_n|$, which shrinks from $\rho_1 = \rho_F$ to $\rho_0 = 1$. By the previous proposition,
the polynomial $F_t \in \mathbf{C}[Z]$ has no roots on $\partial\Gamma$. We can thus apply Theorems 5.6 and 5.4 to
conclude that $\operatorname{ind}(F_1|\partial\Gamma) = \operatorname{ind}(F_0|\partial\Gamma) = n$. □

This completes the proof of the Fundamental Theorem of Algebra: on the one hand
Theorem 5.10 says that $\operatorname{ind}(F|\partial\Gamma) = \deg F$ provided that $\Gamma \supset B(\rho_F)$, and on the other
hand Theorem 5.4 says that $\operatorname{ind}(F|\partial\Gamma)$ equals the number of roots of $F$ in $\Gamma \subset \mathbf{C}$.

6. Further applications of the algebraic index 

Beyond the Fundamental Theorem of Algebra the algebraic index is a versatile tool in 
many situations. This section is an excursion highlighting two beautiful applications: the 
Routh-Hurwitz stability theorem and Brouwer's fixed point theorem. 

6.1. The Routh-Hurwitz stability theorem. In certain applications it is important to
determine whether a given polynomial $F \in C[Z]$ has all of its roots in the left half plane
$C_{\operatorname{re}<0} = \{ z \in C \mid \operatorname{re}(z) < 0 \}$. The origin of this question is the theory of dynamical systems
and the problem of stability of motion:

Example 6.1. Let $A \in \mathbb{R}^{n \times n}$ be a square matrix with real coefficients. The differential
equation $y' = Ay$ with initial condition $y(0) = y_0$ has a unique solution $f\colon \mathbb{R} \to \mathbb{R}^n$ given
by $f(t) = \exp(tA) y_0$. In terms of dynamical systems, the origin $a = 0$ is a fixed point; it
is stable if all eigenvalues $\lambda_1, \dots, \lambda_n \in \mathbb{C}$ of A satisfy $\operatorname{re} \lambda_k < 0$: in this case $\exp(tA)$ has
eigenvalues $\exp(t\lambda_k)$ of absolute value $< 1$. The matrix $\exp(tA)$ is thus a contraction for
all $t > 0$, and every initial value is attracted to $a = 0$, i.e., $f(t) \to 0$ for $t \to +\infty$.

Example 6.2. The previous argument holds locally around fixed points of any dynamical
system given by a differential equation $y' = \Phi(y)$ where $\Phi\colon \mathbb{R}^n \to \mathbb{R}^n$ is continuously
differentiable. Suppose that a is a fixed point, i.e., $\Phi(a) = 0$. It is stable if all eigenvalues of
the matrix $A = \Phi'(a) \in \mathbb{R}^{n \times n}$ have negative real part: in this case there exists a
neighbourhood V of a that is attracted to a: every trajectory $f\colon \mathbb{R}_{\ge 0} \to \mathbb{R}^n$, satisfying $f'(t) = \Phi(f(t))$
for all $t \ge 0$, starting at $f(0) \in V$ satisfies $f(t) \to a$ for $t \to +\infty$.

Given $F \in C[Z]$ we can determine the number of roots with positive real part simply by
considering the rectangle $\Gamma = [0,r] \times [-r,r]$ and calculating $\operatorname{ind}(F|\partial\Gamma)$ for r sufficiently
large. (See the Cauchy radius $\rho_F$ defined in §5.4.) Routh's theorem, however, offers a
simpler solution by calculating the Cauchy index along the imaginary axis. This is usually
proven using complex integration, but here we will give a real-algebraic proof. As usual
we consider a real closed field R and its extension $C = R[i]$ with $i^2 = -1$.

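Definition 6.3. For every polynomial $F \in C[Z]^*$ we define its Routh index as

(6.1)  $\operatorname{Routh}(F) := -\operatorname{Ind}_{-r}^{+r}\left( \frac{\operatorname{re} F(iY)}{\operatorname{im} F(iY)} \right) + \operatorname{Ind}_{-1/r}^{+1/r}\left( \frac{\operatorname{re} F(i/Y)}{\operatorname{im} F(i/Y)} \right)$

for some arbitrary parameter $r \in R_{>0}$; the result is independent of r by Proposition 3.8(b).

Remark 6.4. We can decompose $F(iY) = R + iS$ with $R, S \in R[Y]$ and consider the degrees
$m = \deg S$ and $n = \deg R$. If $m \ge n$, then the fraction $\frac{R(1/Y)}{S(1/Y)} = \frac{Y^m R(1/Y)}{Y^m S(1/Y)}$ has no pole at 0,
the second index vanishes for r sufficiently large, and Equation (6.1) simplifies to

(6.2)  $\operatorname{Routh}(F) = -\operatorname{Ind}_{-\infty}^{+\infty}\left( \frac{R}{S} \right)$.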

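Example 6.5. In general the correction term at $\infty$ cannot be neglected, as illustrated by
$F = (Z - 1)(Z - 2)$: here $F(iY) = -Y^2 - 3iY + 2$, whence

$$\frac{\operatorname{re} F(iY)}{\operatorname{im} F(iY)} = \frac{Y^2 - 2}{3Y} \quad\text{and}\quad \frac{\operatorname{re} F(i/Y)}{\operatorname{im} F(i/Y)} = \frac{1 - 2Y^2}{3Y}.$$

Both indices in Equation (6.1) contribute +1 such that $\operatorname{Routh}(F) = +2$.

Lemma 6.6. We have $\operatorname{Routh}(Z - z_0) = \operatorname{sign}(\operatorname{re} z_0)$ for all $z_0 \in C$.

Proof. For $F = Z - z_0$ we find $F(iY) = R + iS$ with $R = -\operatorname{re} z_0$ and $S = Y - \operatorname{im} z_0$. Thus
$\operatorname{Routh}(F) = -\operatorname{Ind}_{-\infty}^{+\infty}\left( \frac{R}{S} \right) = \operatorname{Ind}_{-\infty}^{+\infty}\left( \frac{\operatorname{re} z_0}{Y - \operatorname{im} z_0} \right) = \operatorname{sign}(\operatorname{re} z_0)$. □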

Lemma 6.7. We have $\operatorname{Routh}(FG) = \operatorname{Routh}(F) + \operatorname{Routh}(G)$ for all $F, G \in C[Z]^*$.

Proof. This follows from the real product formula stated in Theorem 4.8. □

Remark 6.8. For every $c \in C^*$ we have $\operatorname{Routh}(c) = 0$, whence $\operatorname{Routh}(cF) = \operatorname{Routh}(F)$.
This can be used to ensure the favourable situation of Remark 6.4, where $S = \operatorname{im} F(iY)$ has
at least the same degree as $R = \operatorname{re} F(iY)$. If $\deg S < \deg R$, then it is advantageous to pass
to $iF$, that is, to make the replacement $(R,S) \mapsto (-S,R)$.

We can now deduce the following formulation of the famous Routh-Hurwitz theorem:

Theorem 6.9. The Routh index of every polynomial $F \in C[Z]^*$ satisfies $\operatorname{Routh}(F) = p - q$
where p resp. q is the number of roots of F in C having positive resp. negative real part.

Proof. The Fundamental Theorem of Algebra ensures that $F = c (Z - z_1) \cdots (Z - z_n)$ for
some $c \in C^*$ and $z_1, \dots, z_n \in C$. For every split polynomial the Routh index formula follows
from the preceding lemmas. □

This criterion is often applied to real polynomials $P \in R[X]$, as in the motivating
examples above, which warrants the following more detailed formulation:

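Corollary 6.10. Let $P = c_0 + c_1 X + \dots + c_n X^n$ be a polynomial of degree n over R, and
let p resp. q be the number of roots of P in C having positive resp. negative real part. Then

(6.3)  $p - q = \begin{cases} -\operatorname{Ind}_{-\infty}^{+\infty}\left( \frac{\operatorname{re} P(iY)}{\operatorname{im} P(iY)} \right) & \text{if $n$ is odd,} \\ +\operatorname{Ind}_{-\infty}^{+\infty}\left( \frac{\operatorname{im} P(iY)}{\operatorname{re} P(iY)} \right) & \text{if $n$ is even.} \end{cases}$

Both cases can be subsumed into the unique formula

(6.4)  $q - p = \operatorname{Ind}_{-\infty}^{+\infty}\left( \frac{c_{n-1} X^{n-1} - c_{n-3} X^{n-3} + \cdots}{c_n X^n - c_{n-2} X^{n-2} + \cdots} \right)$.

This implies Routh's criterion: All roots of P have negative real part if and only if $q = n$
and $p = 0$, which is equivalent to saying that the Cauchy index in (6.4) evaluates to n.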
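
To make Routh's criterion concrete, here is a minimal Python sketch. It assumes that the unified formula reads q - p = Ind over the whole real line of (c_{n-1}X^{n-1} - c_{n-3}X^{n-3} + ...)/(c_nX^n - c_{n-2}X^{n-2} + ...), and it computes that Cauchy index from a plain euclidean Sturm chain over the rationals, counting sign variations at -∞ and +∞. All function names (`poly_rem`, `sturm_chain`, `cauchy_index_line`, `routh_q_minus_p`) are hypothetical helpers introduced for this illustration; polynomials are coefficient lists with the constant term first.

```python
from fractions import Fraction


def trim(p):
    """Copy of p without trailing zero coefficients."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def poly_rem(a, b):
    """Remainder of a modulo b (b non-zero), computed exactly over Q."""
    a, b = trim(a), trim(b)
    while len(a) >= len(b):
        factor = Fraction(a[-1]) / Fraction(b[-1])
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] -= factor * c
        a = trim(a)
    return a


def sturm_chain(s, r):
    """Chain S_0 = s, S_1 = r, S_{k+1} = -rem(S_{k-1}, S_k)."""
    chain = [trim(s), trim(r)]
    while chain[-1]:
        chain.append([-c for c in poly_rem(chain[-2], chain[-1])])
    chain.pop()  # drop the trailing zero polynomial
    return chain


def variations(signs):
    """Number of sign changes in a list of non-zero signs."""
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def cauchy_index_line(r, s):
    """Cauchy index of r/s over the whole real line, via V(-inf) - V(+inf)."""
    chain = sturm_chain(s, r)
    at_plus = [p[-1] > 0 for p in chain]
    at_minus = [(p[-1] > 0) == (len(p) % 2 == 1) for p in chain]
    return variations(at_minus) - variations(at_plus)


def routh_q_minus_p(c):
    """q - p for P = c[0] + c[1] X + ... + c[n] X^n with c[n] != 0."""
    n = len(c) - 1
    num, den = [0] * (n + 1), [0] * (n + 1)
    for j in range(n + 1):          # j counts down from the top degree
        sgn = (-1) ** (j // 2)      # signs alternate every second degree
        if j % 2 == 0:
            den[n - j] = sgn * c[n - j]
        else:
            num[n - j] = sgn * c[n - j]
    return cauchy_index_line(num, den)
```

For example, `routh_q_minus_p([2, 3, 1])` treats P = (X + 1)(X + 2); P is stable in the sense of the criterion exactly when the returned value equals its degree.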

Routh's formulation via Cauchy indices is unrivaled in its simplicity, and can immediately
be calculated using Sturm's theorem (§3.7). Hurwitz' formulation uses determinants,
which has the advantage of producing explicit polynomial formulae in the given coefficients
of P. For a detailed development see Gantmacher [18, chap. 15], Henrici [23, §6.7],
Marden [35, chap. IX], or Rahman-Schmeisser [42, chap. 11].

6.2. Brouwer's fixed point theorem. Brouwer's theorem states that every continuous
map $f\colon [0,1]^n \to [0,1]^n$ of a cube in $\mathbb{R}^n$ to itself has a fixed point. While in dimension
$n = 1$ this follows directly from the intermediate value theorem, the statement in
dimension $n \ge 2$ is much more difficult to prove: one employs either sophisticated machinery
(differential topology, Stokes' theorem, co/homology) or subtle combinatorial techniques
(Sperner's lemma, Nash's game of Hex). All proofs use Brouwer's mapping degree, in a
more or less explicit way, and the compactness of $[0,1]^n$ plays a crucial role. Such proofs
are often non-constructive and do not address the question of locating fixed points.

Using the algebraic index we can prove Brouwer's theorem in a constructive way over 
real closed fields, restricting the statement from continuous to rational functions: 

Theorem 6.11. Let R be a real closed field and let $P, Q \in R(X,Y)$ be rational functions.
Assume that P, Q have no poles in $\Gamma = [x_0,x_1] \times [y_0,y_1]$, so that they define a map $f\colon \Gamma \to R^2$
by $f(x,y) = (P(x,y), Q(x,y))$. If $f(\Gamma) \subset \Gamma$, then there exists $z \in \Gamma$ such that $f(z) = z$. □

Proof. The essential properties of the algebraic index stated in Theorem 1.2 extend to
rational functions without poles. By translation and homothety we can assume that
$\Gamma = [-1,+1] \times [-1,+1]$. We consider the homotopy $g_t = \operatorname{id} - tf$ from $g_0 = \operatorname{id}$ to $g_1 = \operatorname{id} - f$.
For $z \in \partial\Gamma$ we have $g_t(z) = 0$ if and only if $t = 1$ and $f(z) = z$; in this case the assertion
holds. Otherwise, we have $g_t(z) \neq 0$ for all $z \in \partial\Gamma$ and $t \in [0,1]$. We can then apply
homotopy invariance to conclude that $\operatorname{ind}(g_1|\partial\Gamma) = \operatorname{ind}(g_0|\partial\Gamma) = 1$. Theorem 5.3 implies
that there exists $z \in \operatorname{Int}\Gamma$ such that $g_1(z) = 0$, whence $f(z) = z$. □

Remark 6.12. As for the Fundamental Theorem of Algebra, the algebraic proof of Theorem
6.11 also provides an algorithm to approximate a fixed point to any desired precision.
Here we have to assume the ordered field R to be archimedean, or equivalently $R \subset \mathbb{R}$.
Beginning with $\Gamma_0 = [-1,+1] \times [-1,+1]$ and bisecting successively, we can construct a
sequence of subsquares $\Gamma = \Gamma_0 \supset \Gamma_1 \supset \dots \supset \Gamma_k$ such that f has a fixed point on $\partial\Gamma_k$ or
$\operatorname{ind}(\operatorname{id} - f|\partial\Gamma_k) \neq 0$. In the first case, a fixed point on the boundary $\partial\Gamma_k$ is signalled during
the calculation of $\operatorname{ind}(\operatorname{id} - f|\partial\Gamma_k)$ and leads to a one-dimensional search problem. In the
second case, we continue the two-dimensional approximation.

Remark 6.13. Tarski's theorem says that all real closed fields share the same elementary
theory (see Remark 2.6). This implies that the statement of Brouwer's fixed point theorem
generalizes from the real numbers $\mathbb{R}$ to every real closed field R: as formulated above it is a
first-order assertion in each degree. It is remarkable that there exists a first-order proof over
R that is as direct as the usual second-order proof over $\mathbb{R}$. In this article we concentrate on
dimension $n = 2$, but the algebraic approach generalizes to any finite dimension [15].

Remark 6.14. Over the field $\mathbb{R}$ of real numbers the algebraic version implies the continuous
version as follows. Since $\Gamma = [-1,+1] \times [-1,+1]$ is compact, every continuous function
$f\colon \Gamma \to \Gamma$ can be approximated by polynomials $g_n\colon \Gamma \to \mathbb{R}^2$ such that $|g_n - f| \le \frac{1}{n}$. The
polynomials $f_n = \frac{n}{n+1} g_n$ satisfy $f_n(\Gamma) \subset \Gamma$ and $|f_n - f| \le \frac{2}{n}$. For each n there exists $z_n \in \Gamma$
such that $f_n(z_n) = z_n$ according to Theorem 6.11. Again by compactness of $\Gamma$ we can
extract a convergent subsequence. Assuming $z_n \to z$, we find

$$|f(z) - z| \le |f(z) - f(z_n)| + |f(z_n) - f_n(z_n)| + |z_n - z| \to 0,$$

which proves $f(z) = z$.

7. Algorithmic aspects 

The preceding development shows how to derive Cauchy's algebraic method for locat- 
ing the roots of a complex polynomial, and this section discusses algorithmic questions. 

Remark 7.1. The algorithm described here is often attributed to Wilf [67] in 1978, but
it was already explicitly described by Sturm [55] and Cauchy [9] in the 1830s. It can
also be found in Runge's Encyklopädie article [36, Band 1, §1-B3a6] in 1898. Numerical
variants are known as Weyl's quadtree method (1924) or Lehmer's method (1969), see
§8.9. I propose to call it Cauchy's method, or Cauchy's algebraic method if emphasis is
needed to differentiate it from Cauchy's analytic method using integration. For the theory
of complex polynomials see Marden [35], Henrici [23], and Rahman-Schmeisser [42]; the
latter contains extensive historical notes and an up-to-date guide to the literature.

7.1. Turing computability. The theory of ordered or orderable fields, nowadays called
real algebra, was initiated by Artin and Schreier [3, 4] in the 1920s, culminating in Artin's
solution [1] of Hilbert's 17th problem. Since the 1970s real-algebraic geometry has been
flourishing anew, see Bochnak-Coste-Roy [7], and with the advent of computers algorithmic and
quantitative aspects have regained importance, see Basu-Pollack-Roy [5]. Sinaceur [51]
presents a detailed history of Sturm's theorem and its multiple metamorphoses.

Definition 7.2. We say that an ordered field $(R, +, \cdot, <)$ can be implemented on a Turing
machine if each element $a \in R$ can be coded as input/output for such a machine and each
of the field operations $(a,b) \mapsto a + b$, $a \mapsto -a$, $(a,b) \mapsto a \cdot b$, $a \mapsto a^{-1}$ as well as the
comparisons $a = b$, $a < b$ can be carried out by a uniform algorithm.

Example 7.3. The field $(\mathbb{R}, +, \cdot, <)$ of real numbers cannot be implemented on a Turing
machine because the set $\mathbb{R}$ is uncountable: it is impossible to code all real numbers by finite
strings over a finite alphabet, as required for input/output. This argument is independent
of the chosen representation. If we insist on representing each and every real number, then
this fundamental obstacle can only be circumvented by considering a hypothetical real
number machine [6], which transcends the traditional setting of Turing machines.

Example 7.4. The subset $\mathbb{R}_{\mathrm{comp}} \subset \mathbb{R}$ of computable real numbers, as defined by Turing
[60] in his famous 1936 article, forms a countable, real closed subfield of $\mathbb{R}$. Each
computable number a can be represented as input/output for a universal Turing machine by
an algorithm that approximates a to any desired precision. This overcomes the obstacle
of the previous example by restriction to $\mathbb{R}_{\mathrm{comp}}$. Unfortunately, not all operations of
$(\mathbb{R}_{\mathrm{comp}}, +, \cdot, <)$ can be implemented: there exists no algorithm that for each computable
real number a, given in form of an algorithm, determines whether $a = 0$, or more generally
determines the sign of a. (This is an instance of the notorious Entscheidungsproblem.)

Example 7.5. The algebraic closure $\mathbb{Q}^c$ of $\mathbb{Q}$ in $\mathbb{R}$ is, by definition, a real closed field; it
is the smallest real closed field in the sense that it is contained in every real closed field.
Unlike the field $\mathbb{R}_{\mathrm{comp}}$ of computable real numbers, the much smaller field $(\mathbb{Q}^c, +, \cdot, <)$
can be implemented on a Turing machine [46, 45].

7.2. A global root finding algorithm. We consider a complex polynomial

$$F = c_0 + c_1 Z + \dots + c_n Z^n \quad\text{in } C[Z]$$

that we assume to be implementable, that is, we require the ordered field

$$\mathbb{Q}(\operatorname{re}(c_0), \operatorname{im}(c_0), \operatorname{re}(c_1), \operatorname{im}(c_1), \dots, \operatorname{re}(c_n), \operatorname{im}(c_n)) \subset \mathbb{R}$$

to be implementable in the preceding sense. We begin with the following preparations:

• We divide F by $\gcd(F, F')$ to ensure that all roots of F are simple.

• We determine $r \in \mathbb{N}$ such that all roots of F are contained in the disk $B(r)$.

The following notation will be convenient: a 0-cell is a singleton $\{a\}$ with $a \in C$; a
1-cell is an open line segment $\{x_0\} \times \,]y_0,y_1[$ or $]x_0,x_1[\, \times \{y_0\}$ with $x_0 < x_1$ and $y_0 < y_1$ in
$\mathbb{R}$; a 2-cell is an open rectangle $]x_0,x_1[\, \times \,]y_0,y_1[$ in C.

It is immediate to check whether a 0-cell contains a root of F. Sturm's theorem
(Corollary 3.23) allows us to count the roots in a 1-cell. Cauchy's theorem (Corollary 4.12) allows
us to count the roots in a 2-cell. In both cases the essential subalgorithm is the computation
of Sturm chains for R/S, which we will discuss in §7.4 below. Building on this, the root
finding algorithm successively refines a list $L_j = \{\Gamma_1, \dots, \Gamma_{n_j}\}$ of disjoint cells such that:

• Each root of F is contained in exactly one cell $\Gamma \in L_j$.

• Each cell $\Gamma \in L_j$ contains at least one root of F.

• Each cell $\Gamma \in L_j$ has diameter $\le 3r \cdot 2^{-j}$.

More explicitly, the algorithm proceeds as follows:

We initialize $L_0 = \{\Gamma\}$ with the square $\Gamma = \,]{-r},+r[\, \times \,]{-r},+r[$.

Given $L_j$ we construct $L_{j+1}$ by treating each cell $\Gamma \in L_j$ as follows:

(0) If $\Gamma$ is a 0-cell, then retain $\Gamma$.

(1) If $\Gamma$ is a 1-cell, then bisect $\Gamma$ into two 1-cells of equal length.
Retain each new 1-cell that contains a root of F.
Retain the new 0-cell if it contains a root of F.

(2) If $\Gamma$ is a 2-cell, then bisect $\Gamma$ into four 2-cells of equal size.
Retain each new 2-cell that contains a root of F.
Retain each new 1-cell that contains a root of F.
Retain the new 0-cell if it contains a root of F.

Collecting all retained cells we obtain the new list $L_{j+1}$. After some initial iterations
all roots will lie in disjoint cells $\Gamma_1, \dots, \Gamma_n$, each containing precisely one root. Taking the
midpoint $u_k \in \Gamma_k$, this can be seen as n approximate roots $u_1, \dots, u_n$ each with an error
bound $\delta_k \le \frac{3}{2} r \cdot 2^{-j}$ such that each $u_k$ is $\delta_k$-close to a root of F.
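
The refinement loop is easiest to see in its one-dimensional form. The following Python sketch is only an analogue of the algorithm above, under stated assumptions: it isolates the real roots of a square-free rational polynomial by bisection, counting roots with Sturm's theorem, and it uses half-open intervals ]a, b] instead of the separate 0-cells and 1-cells of the text. The two-dimensional step (counting roots in a rectangle via the Cauchy index along its boundary) is not implemented here. All function names are hypothetical; polynomials are coefficient lists with the constant term first.

```python
from fractions import Fraction


def trim(p):
    """Copy of p without trailing zero coefficients."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def poly_rem(a, b):
    """Remainder of a modulo b (b non-zero), computed exactly over Q."""
    a, b = trim(a), trim(b)
    while len(a) >= len(b):
        factor = Fraction(a[-1]) / Fraction(b[-1])
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] -= factor * c
        a = trim(a)
    return a


def sturm_chain(p):
    """Sturm chain P, P', -rem(P, P'), ... of a square-free polynomial P."""
    p = trim(p)
    chain = [p, trim([k * c for k, c in enumerate(p)][1:])]
    while chain[-1]:
        chain.append([-c for c in poly_rem(chain[-2], chain[-1])])
    chain.pop()  # drop the trailing zero polynomial
    return chain


def variations_at(chain, x):
    """Sign variations of the chain at x, ignoring vanishing entries."""
    values = [sum(c * x ** k for k, c in enumerate(q)) for q in chain]
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def isolate_real_roots(p, a, b, width):
    """Intervals ]lo, hi] of length <= width, each holding exactly one root."""
    chain = sturm_chain(p)
    todo, done = [(Fraction(a), Fraction(b))], []
    while todo:
        lo, hi = todo.pop()
        count = variations_at(chain, lo) - variations_at(chain, hi)
        if count == 0:
            continue                      # discard cells without roots
        if count == 1 and hi - lo <= width:
            done.append((lo, hi))         # retain a sufficiently small cell
            continue
        mid = (lo + hi) / 2
        todo += [(lo, mid), (mid, hi)]    # bisect and treat both halves
    return sorted(done)


# P = X^3 - 3X has the three real roots -sqrt(3), 0, +sqrt(3).
cells = isolate_real_roots([0, -3, 0, 1], -2, 2, Fraction(1, 100))
```

Taking midpoints of the returned intervals gives approximate roots with an explicit error bound, in the same spirit as the midpoints $u_k$ above.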

7.3. Cross-over to Newton's local method. For $F \in C[Z]$ Newton's method consists in
iterating the map $\Phi\colon C \setminus \mathcal{Z}(F') \to C$ given by $\Phi(z) = z - F(z)/F'(z)$, where $\mathcal{Z}(F')$ denotes
the set of zeros of $F'$. Its strength resides in the following well-known property:

Theorem 7.6. The fixed points of Newton's map $\Phi$ are the simple zeros of F, that is, $z_0 \in C$
such that $F(z_0) = 0$ and $F'(z_0) \neq 0$. For each fixed point $z_0$ there exists $\delta > 0$ such that
every initial value $u_0 \in B(z_0, \delta)$ satisfies $|\Phi^n(u_0) - z_0| \le 2^{1-2^n} \cdot |u_0 - z_0|$ for all $n \in \mathbb{N}$. □

The convergence is thus extremely fast, but the main obstacle is to find sufficiently good
approximations $u_0 \approx z_0$ as starting values. Our global root finding algorithm approximates
all roots simultaneously, and the following simple criterion exploits this information:

Proposition 7.7. Let $F \in C[Z]$ be a separable polynomial of degree n. Suppose we have
separated the roots in disjoint disks $B(u_k, \delta_k)$ for $k = 1, \dots, n$ such that

$$3n\delta_k \le |u_k - u_j| \quad\text{for all } j \neq k.$$

Then Newton's algorithm converges for each starting value $u_k$ to the corresponding root
$z_k \in B(u_k, \delta_k)$. More precisely, convergence is at least as fast as

$$|\Phi^n(u_k) - z_k| \le 2^{-n} |u_k - z_k| \quad\text{for all } n \in \mathbb{N}.$$

The hypothesis can be verified directly from the approximations $(u_k, \delta_k)_{k=1,\dots,n}$
produced by the global root finding algorithm of §7.2. Newton's method eventually converges
much faster, and Proposition 7.7 only shows that right from the start Newton's method is
at least as fast as bisection.
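
Proof. For $F = (Z - z_1) \cdots (Z - z_n)$ we have $F'/F = \sum_{j=1}^{n} (Z - z_j)^{-1}$. This entails
$\Phi(z) = z - 1 / \sum_{j=1}^{n} (z - z_j)^{-1}$, provided that $F(z) \neq 0$ and $F'(z) \neq 0$, whence

$$\frac{\Phi(z) - z_k}{z - z_k} = 1 - \frac{1}{\sum_{j=1}^{n} \frac{z - z_k}{z - z_j}} = 1 - \frac{1}{1 + \sum_{j \neq k} \frac{z - z_k}{z - z_j}}.$$

By hypothesis we have approximate roots $u_1, \dots, u_n$ such that $|u_k - z_k| \le \delta_k$. Consider
$z \in B(z_k, \delta_k)$, which entails $|z - u_k| \le 2\delta_k$. The inequality $3n\delta_k \le |u_k - u_j|$ for all $j \neq k$
implies $(3n - 3)\delta_k + 2\delta_k + \delta_j \le |u_k - u_j|$ and thus

$$|z - z_j| \ge |u_k - u_j| - 2\delta_k - \delta_j \ge (3n - 3)\delta_k \quad\text{for all } j \neq k.$$

This ensures that $\left| \frac{z - z_k}{z - z_j} \right| \le \frac{1}{3n - 3}$, whence $\left| \sum_{j \neq k} \frac{z - z_k}{z - z_j} \right| \le \sum_{j \neq k} \left| \frac{z - z_k}{z - z_j} \right| \le \frac{1}{3}$ and

$$\left| \frac{\Phi(z) - z_k}{z - z_k} \right| = \left| 1 - \frac{1}{1 + \sum_{j \neq k} \frac{z - z_k}{z - z_j}} \right| \le \frac{1}{2}.$$

This shows that $|\Phi^n(z) - z_k| \le 2^{-n} |z - z_k|$ for all $z \in B(z_k, \delta_k)$ and all $n \in \mathbb{N}$. In particular
this holds for the starting value $z = u_k$ in $B(z_k, \delta_k)$. □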
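
The cross-over can be sketched in a few lines of Python. The sketch assumes that the separation hypothesis of Proposition 7.7 reads 3nδ_k ≤ |u_k − u_j| for all j ≠ k, and it works in floating-point complex arithmetic rather than in an exactly implemented field, so it illustrates the idea without being a certified implementation. The names `value_and_derivative`, `well_separated` and `newton` are hypothetical helpers.

```python
def value_and_derivative(coeffs, z):
    """Return (F(z), F'(z)) for coeffs = [c_0, ..., c_n] by Horner's scheme."""
    value, deriv = 0, 0
    for c in reversed(coeffs):
        deriv = deriv * z + value
        value = value * z + c
    return value, deriv


def well_separated(approx):
    """Check 3*n*delta_k <= |u_k - u_j| for all j != k.

    approx is a list of pairs (u_k, delta_k), one per root.
    """
    n = len(approx)
    return all(
        3 * n * dk <= abs(uk - uj)
        for k, (uk, dk) in enumerate(approx)
        for j, (uj, _) in enumerate(approx)
        if j != k
    )


def newton(coeffs, start, steps):
    """Iterate Newton's map z -> z - F(z)/F'(z) a fixed number of times."""
    z = start
    for _ in range(steps):
        value, deriv = value_and_derivative(coeffs, z)
        if value == 0:
            break                  # already at a root
        z = z - value / deriv
    return z


# F = (Z - 1)(Z - 2) with approximate roots 1.05 and 1.95, each within 0.1.
F = [2, -3, 1]
approx = [(1.05, 0.1), (1.95, 0.1)]
if well_separated(approx):
    refined = [newton(F, u, 8) for u, _ in approx]
```

When the separation test fails, the sketch simply does nothing; in the setting of the text one would continue bisecting until the test succeeds.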

As an alternative to our tailor-made Proposition 7.7, the following theorem of Smale
[6, chap. 8] provides a general convergence criterion in terms of local data. It applies in
particular to polynomials, where it is most easily implemented.

Theorem 7.8 (Smale 1986). Let $f\colon \mathbb{C} \supset U \to \mathbb{C}$ be an analytic function. Consider $u_0 \in U$
such that $f'(u_0) \neq 0$, and let $\eta = |f(u_0)/f'(u_0)|$ be the initial displacement in Newton's
iteration. Suppose that $f(z) = \sum_{k=0}^{\infty} c_k (z - u_0)^k$ for all $z \in B(u_0, 2\eta)$. If

$$|c_k| \le (8\eta)^{1-k} \cdot |c_1| \quad\text{for all } k \ge 2,$$

then f has a unique zero $z_0$ in $B(u_0, 2\eta)$, and Newton's iteration converges as

$$|\Phi^n(u_0) - z_0| \le 2^{1-2^n} \cdot |u_0 - z_0| \quad\text{for all } n \in \mathbb{N}.$$

7.4. Cauchy index computation. In this section we briefly consider bit-complexity. To
simplify we shall work over the rational numbers $\mathbb{Q}$. For $R/S \in \mathbb{Q}(X)$, with $\gcd(R,S) = 1$
say, we wish to calculate a Sturm chain $S_0 = S, S_1 = R, \dots, S_n = 1, S_{n+1} = 0$ such that

(7.1)  $a_k S_{k-1} + b_k S_{k+1} = Q_k S_k$ with $Q_k \in \mathbb{Q}[X]$ and $a_k, b_k \in \mathbb{Q}_+$.

Applying the usual euclidean algorithm to polynomials of degree $\le n$, this takes $O(n^2)$
arithmetic operations in $\mathbb{Q}$. This over-simplification, however, neglects the notorious
problem of coefficient swell, which plagues naive implementations with exponential running
time. This difficulty can be overcome by replacing the euclidean remainder sequence by
subresultants, which were introduced by Sylvester [58]. Habicht [22] systematically studied
subresultants and used them to construct Sturm chains whose coefficients are polynomial
functions in the input coefficients, and not rational functions as given by euclidean
division. Subresultants have become a highly developed tool of computer algebra; we refer to
Gathen-Gerhard [19, chapters 6 and 11] and the references cited therein. This should be
taken into account when choosing or developing a library for polynomial arithmetic.
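
As a small illustration of relation (7.1), the following Python sketch builds a Sturm chain with integer coefficients only. It is not the subresultant algorithm discussed above: it is a simpler primitive pseudo-remainder variant, chosen here as an assumption for brevity. Each step multiplies $S_{k-1}$ by an even power of the leading coefficient of $S_k$ (so $a_k > 0$), negates the pseudo-remainder, and divides by the positive content (so $b_k > 0$). The helper names are hypothetical; polynomials are integer coefficient lists with the constant term first.

```python
from functools import reduce
from math import gcd


def trim(p):
    """Copy of p without trailing zero coefficients."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def pseudo_rem(a, b):
    """Integer r with m*a = q*b + r, deg r < deg b, for some integer m > 0."""
    a, b = trim(a), trim(b)
    lead = b[-1]
    while len(a) >= len(b):
        shift, top = len(a) - len(b), a[-1]
        a = [lead * lead * c for c in a]   # scale by lead^2 > 0
        for i, c in enumerate(b):
            a[shift + i] -= top * lead * c  # cancels the leading term
        a = trim(a)
    return a


def primitive_part(p):
    """Divide p by the (positive) gcd of its coefficients."""
    content = reduce(gcd, p, 0)
    return [c // content for c in p] if content > 1 else list(p)


def integer_sturm_chain(s, r):
    """Chain S_0 = s, S_1 = r with a_k S_{k-1} + b_k S_{k+1} = Q_k S_k."""
    chain = [trim(s), trim(r)]
    while chain[-1]:
        rem = pseudo_rem(chain[-2], chain[-1])
        chain.append(primitive_part([-c for c in rem]))
    chain.pop()  # drop the trailing zero polynomial
    return chain


# S = X^3 - 3X and R = 3X^2 - 1 give the chain S, R, X, 1.
chain = integer_sturm_chain([0, -3, 0, 1], [-1, 0, 3])
```

Since all multipliers are positive, the signs of the chain agree with those of the euclidean chain over $\mathbb{Q}$, so sign-variation counts are unaffected.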

Theorem 7.9. Let $F = c_n Z^n + c_{n-1} Z^{n-1} + \dots + c_1 Z + c_0$ be a polynomial of degree n with
Gaussian integer coefficients such that $|\operatorname{re} c_k| \le 2^a$ and $|\operatorname{im} c_k| \le 2^a$ for all $k = 0, \dots, n$.
Suppose that all roots of F lie in the disk $B(r)$. The above root finding algorithm determines
all roots of F to a precision $3r/2^b$ requiring $\tilde{O}(n^3 b (a + nb))$ bit-operations. □

Proof. Suppose that $R, S \in \mathbb{Z}[X]$ are of degree $\le n$ and all coefficients are bounded by $A = 2^a$.
According to Lickteig-Roy [32] and Gathen-Gerhard [19, Cor. 11.17] the subresultant
algorithm requires $\tilde{O}(n^2 a)$ bit-operations. This has to be iterated b times; coefficients are
bounded by $A = 2^{a+nb}$. Since we assume all roots to be distinct, they ultimately become
separated so that the algorithm has to follow n approximations in parallel. This multiplies
the previous bound by a factor nb, so we arrive at $\tilde{O}(n^3 b (a + nb))$ bit-operations. □

7.5. What remains to be improved? Root-finding algorithms of bit-complexity $\tilde{O}(n^2 (n + b))$
are the world record since the ground-breaking work of Schönhage [49] in the 1980s.
Cauchy's algebraic method is of complexity $\tilde{O}(n^4 b^2)$ and thus comes close, but in its
current form it remains one order of magnitude more expensive. Schönhage remarks:

It is not clear whether methods based on Sturm sequences can possibly
become superior. Lehmer [31] and Wilf [67] both do not solve the
extra problems which arise, if there is a zero on the test contour (circle or
rectangle) or very close to it. [49, p. 5]

Notice that we have applied the divide-and-conquer paradigm in the arithmetic
subalgorithms, but not in the root-finding method itself. In Schönhage's method this is achieved
by approximately factoring F of degree n into two polynomials $F_1, F_2$ of degrees close to
$\frac{n}{2}$. It is plausible but not obvious that a similar strategy can be put into practice in the
algebraic setting. Some clever idea and a more detailed investigation are needed here.

Our development neatly solves the problem of roots on the boundary. Of course, ap- 
proximating the roots of a polynomial F G C[Z] can only be as good as the initial data, 
and we therefore assume that F is known exactly. This is important because root-finding 
is an ill-conditioned problem, see Wilkinson [68]. Even if exact arithmetic can avoid this 
problem during the computation, it comes back into focus when the initial data is itself 
only an approximation. In this more general situation the real-algebraic approach requires 
a detailed error analysis, ideally in the setting of interval arithmetic and recursive analysis. 

7.6. Formal proofs. In recent years the theory and practice of formal proofs and
computer-verified theorems has become a fully fledged enterprise. A prominent and much discussed
example is the Four Colour Theorem, see Gonthier [21]. The computer-verified proof
community envisages much more ambitious projects, such as the classification of finite simple
groups. See the Mathematical Components Manifesto by Gonthier, Werner, and Bertot at
www.msr-inria.inria.fr/projects/math/manifesto.html.

Such gigantic projects make the Fundamental Theorem of Algebra look like a toy
example, but its formalization is by no means a trivial task. A constructive proof, along the
lines of Hellmuth Kneser (1940) and Martin Kneser (1981), has been formalized by the
FTA project at Nijmegen (www.cs.ru.nl/~freek/fta) using the COQ proof assistant
(pauillac.inria.fr/coq). Work is in progress so as to extract the algorithm implicit in
the proof (c-corn.cs.ru.nl).

Here the real-algebraic approach offers clear advantages, mainly its conceptual simplicity
and its algorithmic character. The latter is an additional important aspect: the theorem
is not only an existence statement but immediately translates to an algorithm. A formal
proof of the theorem will also serve as a formal proof of the implementation. As a first
step, Mahboubi [34] discusses a formal proof of the subresultant algorithm.

8. Historical remarks 

The Fundamental Theorem of Algebra is a crowning achievement in the history of 
mathematics. In order to place our real-algebraic approach into perspective, this section 
sketches its historical context. For the history of the Fundamental Theorem of Algebra I 
refer to Remmert [43], Dieudonné [13, chap. II, §III], and van der Waerden [63, chap. 5].
The history of Sturm's theorem has been examined in great depth by Sinaceur [51]. 

8.1. Solving polynomial equations. The method to solve quadratic equations was known
to the Babylonians. Not much progress was made until the 16th century, when del Ferro 
(around 1520) and Tartaglia (1535) discovered a solution for cubic equations by radicals. 
Cardano's student Ferrari extended this to a solution of quartic equations by radicals. Both 
formulae were published in Cardano's Ars Magna in 1545. Despite considerable efforts 
during the following centuries, no such formulae could be found for degree 5 and higher. 
They were finally shown not to exist by Ruffini (1805), Abel (1825), and Galois (1831). 
This solved one of the outstanding problems of algebra, alas in the negative. 

The lack of general formulae provoked the question whether solutions exist at all. The
existence of n roots for each real polynomial of degree n was mentioned by Roth (1608)
and explicitly conjectured by Girard (1629) and Descartes (1637). They postulated these
roots in some extension of $\mathbb{R}$ but did not claim that all roots are contained in the field
$\mathbb{C} = \mathbb{R}[i]$. Leibniz (1702) even speculated that this is in general not possible.

The first proofs of the Fundamental Theorem of Algebra were published by d'Alembert
(1746), Euler (1749), Lagrange (1772), and Laplace (1795). In his doctoral thesis (1799)
Gauss criticized the shortcomings of all previous attempts and gave what is commonly
considered as the first rigorous proof of the Fundamental Theorem of Algebra.

8.2. Gauss' first proof. Gauss considers $F = Z^n + c_{n-1} Z^{n-1} + \dots + c_1 Z + c_0$; upon
substitution of $Z = X + iY$ he obtains $F = R + iS$ with $R, S \in \mathbb{R}[X,Y]$. The roots of F are
precisely the intersections of the two curves $R = 0$ and $S = 0$ in the plane. Near a circle $\partial\Gamma$
with sufficiently large radius around 0, these curves resemble those of $Z^n$. The latter are 2n
straight lines passing through the origin. The circle $\partial\Gamma$ thus intersects each of the curves
$R = 0$ and $S = 0$ in 2n points placed in an alternating fashion around the circle.

Prolongating these curves into the interior of $\Gamma$, Gauss concludes that the curves $R = 0$
and $S = 0$ must intersect somewhere inside the circle. This conclusion relies on certain
(intuitively plausible) assumptions, which Gauss clearly states but does not prove.

Satis bene certe demonstratum esse videtur, curvam algebraicam neque
alicubi subito abrumpi posse (uti e.g. evenit in curva transscendente, cuius
aequatio y = 1/log x), neque post spiras infinitas in aliquo puncto se quasi
perdere (ut spiralis logarithmica), quantumque scio nemo dubium contra
hanc rem movit. Attamen si quis postulat, demonstrationem nullis dubiis
obnoxiam alia occasione tradere suscipiam.¹ [20, Bd. 3, p. 27]

To modern standards Gauss' first proof is thus incomplete. The unproven assertions are
indeed correct, and have later been rigorously worked out by Ostrowski [38, 39].

Notice that Gauss' argument shows that $\operatorname{ind}(F|\partial\Gamma) = n$, and our development of the
algebraic index exhibits a short and rigorous path to the desired conclusion. Our proof can
thus be considered as an algebraic version of Gauss' first proof, suitably completed by the
techniques of Sturm and Cauchy, and justified by the intermediate value theorem.

8.3. Gauss' further proofs. Gauss gave two further proofs in 1816, and a fourth proof 
in 1849 which is essentially an improved version of his first proof [63, chap. 5]. The 
second proof is algebraic (§8.8.2), the third proof uses integration (§8.8.3) and foreshadows 
Cauchy's integral formula for the winding number. 

When Gauss published his fourth proof in 1849 for his doctorate jubilee, the works 
of Sturm (1835) and Cauchy (1837) had been known for several years, and in particular 
Sturm's theorem had immediately risen to international acclaim. In principle Gauss could 
have taken up his first proof and completed it by arguments similar to the ones presented 
here. This has not happened, however, so we can speculate that Gauss was perhaps unaware
of the work of Sturm, Cauchy, and Sturm-Liouville on complex roots of polynomials.
Completing Gauss' geometric argument, Ostrowski [39] mentions the relationship with
the Cauchy index but builds his proof on topological arguments.

¹ It seems to be well demonstrated that an algebraic curve neither ends abruptly (as it happens in the
transcendental curve y = 1/log x), nor does it quasi loose itself after an infinite number of windings in a point (like a
logarithmic spiral). As far as I know nobody has ever doubted this, but if anybody requires it, I take it on me to
present, on another occasion, an indubitably proof. (Translation cited from van der Waerden [63, p. 96])

8.4. Sturm, Cauchy, Liouville. In 1820 Cauchy proved the Fundamental Theorem of
Algebra, using the existence of a global minimum $z_0$ of $|F|$ and a local argument showing
that $F(z_0) = 0$, see §8.8.1. While the local analysis is rigorous, global existence requires
some compactness argument, which was yet to be developed, see Remmert [43, §1.8].

Sturm's theorem for counting real roots was announced in 1829 [53] and published in 
1835 [54]. It was immediately assimilated by Cauchy in his residue calculus [8], based on 
complex integration, which was published in 1831 during his exile in Turin. In 1837 he 
published a more detailed exposition [9] with more direct, analytic-geometric proofs, but 
explicitly recognizes the relation to Sturm's theorem [9, pp. 426-427,431]. 

In the intervening years, Sturm and Liouville [57, 55] had elaborated their own proof
of Cauchy's theorem, which they published in 1836. (Loria [33] and Sinaceur [51, I. VI]
examine the interaction between Sturm, Liouville, and Cauchy in detail.) As opposed
to Cauchy, their arguments are based on what they call the "first principles of algebra".
In the terminology of their time this means the theory of complex numbers, including
trigonometric coordinates $z = r(\cos\theta + i\sin\theta)$ and de Moivre's formula, but excluding
integration. Furthermore they use sign variations and, of course, the intermediate value
theorem of real functions, as well as tacit compactness arguments.

8.5. Sturm's vision revived. Sturm, in his article [55] continuing his work with Liouville
[57], presents arguments which closely parallel our real-algebraic proof: the argument
principle (Prop. 1, p. 294), multiplicativity of the index (Prop. 2, p. 295), counting roots
of a split polynomial within a given region (Prop. 3, p. 297), the index in the absence of
zeros (Prop. 4, p. 297), and finally Cauchy's theorem (p. 299). One crucial step is to show
$\operatorname{ind}(F|\partial\Gamma) = 0$ when F does not vanish in $\Gamma$. This is solved by subdivision and a tacit
compactness argument (pp. 298-299); our compactness proof of Theorem 5.3 completes
his argument. Sturm then deduces the Fundamental Theorem of Algebra (pp. 300-302)
and expounds on the practical computation of the Cauchy index $\operatorname{ind}(F|\partial\Gamma)$ using Sturm
chains as in the real case (pp. 303-308).

Sturm's exposition strives for algebraic simplicity, but his arguments are ultimately 
based on geometric and analytic techniques. It is only on the final pages that Sturm em- 
ploys his algebraic method for computing the Cauchy index. This mixed state of affairs 
has been passed on ever since, even though it is far less satisfactory than Sturm's purely 
algebraic treatment of the real case. Our proof shows that Sturm's vision of the complex 
case can be salvaged and his arguments can be put on firm real-algebraic ground. 

We note that Sturm and Liouville explicitly exclude zeros on the boundary:

Toutefois nous excluons formellement le cas particulier où, pour quelque
point de la courbe ABC, on aurait à la fois P = 0, Q = 0 : ce cas
particulier ne jouit d'aucune propriété régulière et ne peut donner lieu à aucun
théorème.² [57, p. 288]

This seems overly pessimistic in view of our Theorem 1.9 above. In his continuation
[55], Sturm formulates the same problem much more cautiously:

C'est en admettant cette hypothèse que nous avons démontré le théorème
de M. Cauchy ; les modifications qu'il faudrait y apporter dans le cas où
il aurait des racines sur le contour même ABC, exigeraient une discussion
longue et minutieuse que nous avons voulu éviter en faisant abstraction
de ce cas particulier.³ [55, p. 306]

² We formally exclude, however, the case where for some point of the curve ABC we have simultaneously
P = 0 and Q = 0: this special case does not enjoy any regular property and cannot give rise to any theorem.

It seems safe to say that our detailed discussion is just as "long and meticulous" as the 
usual development of Sturm's theorem. Modulo these details, the cited works of Gauss, 
Cauchy, and Sturm contain the essential ideas for the real-algebraic approach. It remained 
to work them out. To this end our presentation refines the techniques in several ways: 

• We purge all arguments of transcendental functions and compactness assumptions. 
This simplifies the proof and generalizes it to real closed fields. 

• The product formula (§4.5) and homotopy invariance (§5.3) streamline the proof 
and avoid tedious calculations. 

• The uniform treatment of boundary points extends Sturm's theorem to piecewise 
polynomial functions and leads to straightforward algorithms. 

8.6. Further development in the 19th century. Sturm's theorem was a decisive step in 
the development of algebra as an autonomous field, independent of analysis, in particular 
in the hands of Sylvester and Hermite. For a detailed discussion see Sinaceur [51]. 

In 1869 Kronecker [29] constructed his higher-dimensional index (also called Kro- 
necker characteristic) using integration. His initial motivation was to generalize Sturm's 
theorem to higher dimensions, extending previous work of Sylvester and Hermite, but he 
then turned to analytic methods in order to solve the foundational difficulties of his index 
theory. Subsequent work was likewise built on analytic methods over $\mathbb{R}$: one gains in 
generality by extending the index to smooth or even continuous functions, but one loses 
algebraic generality, simplicity, and computability. 

The problem of stability of motion led Routh [44] in 1878 and Hurwitz [24] in 1895 
to count the number of complex roots having negative real part (§6.1). With the celebrated 
Routh-Hurwitz theorem, the algebraic index made the transition from algebra to applications, 
where it survives to the present day. In the 1898 Encyklopädie der mathematischen 
Wissenschaften [36, Band I], Netto's survey on the Fundamental Theorem of Algebra 
(§I-B1a7) mentions Cauchy's algebraic approach only briefly (p. 236), while Runge's article 
on approximation of complex roots (§I-B3a6) discusses Cauchy's method in greater detail 
(pp. 418-422). In the 1907 Encyclopédie des Sciences Mathématiques [37], Netto and le 
Vavasseur give an overview of nearly 100 published proofs (tome I, vol. 2, §80-88), including 
Cauchy's argument principle (§87). The work of Sturm-Liouville [57, 55] is cited, 
but the algebraic approach via Sturm chains is not mentioned. 

8.7. 19th century textbooks. While Sturm's theorem made its way from 19th century 
algebra to modern algebra textbooks and is still taught today, it seems that the algebraic 
approach to the complex case has been lost on the way. Let me illustrate this by two 
prominent and perhaps representative textbooks. 

In his 1877 textbook Cours d'algèbre supérieure, Serret [50, pp. 118-132] presents the 
proof of the Fundamental Theorem of Algebra following Cauchy and Sturm-Liouville, 
with only minor modifications. Two decades later, Weber consecrated over 100 pages to 
real-algebraic equations in his 1898 textbook Lehrbuch der Algebra [64], where he presents 
Sturm's theorem in great detail (§91-106). Calling upon Kronecker's geometric index the- 
ory (§100-102), he sketches how to count complex roots (§103-104). Quite surprisingly, 
he uses only $\operatorname{ind}(P'/P)$ and Corollary 3.23 where the general case $\operatorname{ind}(R/S)$ and Theorem 3.20 
would have been optimal. Here Cauchy's algebraic method [9], apparently unknown to 
Weber, had gone much further concerning explicit formulae and concrete computations. 



³ It is under this hypothesis that we have proven the theorem of Mr. Cauchy; the necessary modifications in 
the case where roots were on the contour ABC would require a long and meticulous discussion, which we have 
wanted to avoid by neglecting this special case. 
8.8. Survey of proof strategies. Since the time of Gauss numerous proofs of the Fundamental 
Theorem of Algebra have been developed. We refer to Remmert [43] for a concise 
overview and to Fine-Rosenberger [16] for a textbook presentation. As mentioned in §1.2, 
the proof strategies can be grouped into three families: 

8.8.1. Analysis. Proofs in this family are based on the existence of a global minimum $z_0$ of 
$|F|$ and some local argument from complex analysis showing that $F(z_0) = 0$ (d'Alembert 
1746, Argand 1814, Cauchy 1820). See Remmert [43, §2] for a presentation in its his- 
torical context, or Rudin [47, chap. 8] in the context of a modern analysis course. In its 
most succinct form, this is formulated by Liouville's theorem for entire functions. Such 
arguments are in general not constructive; for constructive refinements see [43, §2.5]. 

8.8.2. Algebra. Proofs in this family use the fundamental theorem of symmetric polynomials 
in order to reduce the problem from real polynomials of degree $2^k m$ with $m$ odd to 
degree $2^{k-1} m'$ with $m'$ odd (Euler 1749, Lagrange 1772, Laplace 1795, Gauss 1816, see 
[43, appendix]). The argument can be reformulated using Galois theory, see Cohen [11, 
Thm. 8.8.7], Jacobson [26, Thm. 5.2], or Lang [30, §VI.2, Ex. 5]. The induction is based, 
for $k = 0$, on real polynomials of odd degree, where the existence of at least one real root is 
guaranteed by the intermediate value theorem. This algebraic proof thus works over every 
real closed field. It is constructive but ill-suited to actual computations. 
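
The degree bookkeeping behind this induction can be made explicit. The following is a sketch of the count in Laplace's variant as it is commonly presented; the auxiliary polynomial $Q_c$, the parameter $c$ and the roots $x_1, \dots, x_n$ are notation introduced here for illustration and do not appear in the text above.

```latex
% Assumption: P is a real polynomial of degree n = 2^k m with m odd and k >= 1,
% x_1, ..., x_n are its roots in a splitting field, and c is a real parameter.
% The coefficients of Q_c are symmetric in the x_i, hence real.
\[
  Q_c(X) \;=\; \prod_{1 \le i < j \le n} \bigl( X - (x_i + x_j + c\,x_i x_j) \bigr),
  \qquad
  \deg Q_c \;=\; \binom{n}{2} \;=\; \frac{n(n-1)}{2}
  \;=\; 2^{k-1}\, m\,(2^k m - 1) \;=\; 2^{k-1} m' .
\]
% Since k >= 1, the factor 2^k m - 1 is odd, hence m' = m (2^k m - 1) is odd.
```

Each step thus lowers the exponent of 2 by one, until the odd-degree case $k = 0$ is reached.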

8.8.3. Topology. Proofs in this family use, in one form or another, the winding number 
$\operatorname{ind}(\gamma)$ of closed paths $\gamma\colon [0,1] \to \mathbb{C}^*$ (Gauss 1799/1816, Cauchy 1831/37, Sturm-Liouville 
1836). As mentioned in Remark 1.5, the winding number can be defined in various ways. 
In each case the difficulty is to rigorously construct the index and to establish its characteristic 
properties: multiplicativity and homotopy invariance, as stated in Theorem 1.2. 

Our proof belongs to this last family. Unlike previous proofs, however, we do not base 
the index on analytical or topological arguments but on real algebra. 

8.9. Constructive and algorithmic aspects. Sturm's method is eminently practical, both by 
the standards of 19th century mathematics and for modern-day implementations. As early 
as 1840 Sylvester [58] wrote "Through the well-known ingenuity and proferred help of a 
distinguished friend, I trust to be able to get a machine made for working Sturm's theo- 
rem (...)". It seems, however, that such a machine was never built. Calculating machines 
had been devised by Pascal, Leibniz, and Babbage; the latter was Lucasian Professor of 
Mathematics at Cambridge when Sylvester studied there in the 1830s. 
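
To illustrate why Sturm's method lends itself to mechanical computation, here is a minimal Python sketch of real root counting by sign variations in a Sturm chain. It is not taken from the paper: the function names are hypothetical, polynomials are assumed to be coefficient lists with the highest degree first, and the count is only claimed under the assumption that the polynomial does not vanish at either endpoint.

```python
from fractions import Fraction

def poly_eval(p, x):
    """Evaluate p (coefficients, highest degree first) at x by Horner's rule."""
    acc = Fraction(0)
    for c in p:
        acc = acc * x + c
    return acc

def poly_rem(a, b):
    """Remainder of a divided by b (b nonzero with nonzero leading coefficient)."""
    a = list(a)
    while len(a) >= len(b):
        q = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= q * b[i]
        a.pop(0)  # the leading coefficient has just been cancelled
    while a and a[0] == 0:  # strip remaining leading zeros
        a.pop(0)
    return a

def derivative(p):
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]

def sturm_chain(p):
    """Chain p, p', then negated remainders; assumes deg p >= 1 and p[0] != 0."""
    p = [Fraction(c) for c in p]
    chain = [p, derivative(p)]
    while True:
        r = poly_rem(chain[-2], chain[-1])
        if not r:
            return chain
        chain.append([-c for c in r])

def sign_variations(values):
    signs = [v for v in values if v != 0]  # zeros are skipped
    return sum(1 for s, t in zip(signs, signs[1:]) if (s > 0) != (t > 0))

def count_real_roots(p, a, b):
    """Distinct real roots of p in (a, b), assuming p(a) != 0 and p(b) != 0."""
    chain = sturm_chain(p)
    va = sign_variations([poly_eval(q, a) for q in chain])
    vb = sign_variations([poly_eval(q, b) for q in chain])
    return va - vb
```

For example, `count_real_roots([1, 0, -1, 0], -2, 2)` counts the roots of X^3 - X between -2 and 2. Exact rational arithmetic via `Fraction` is chosen so that all sign decisions are exact, at the price of coefficient growth in the chain.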

The idea of computing machinery seems to have been common among mid-19th century 
mathematicians. In a small note of 1846, Ullherr [62] remarks that the argument principle 
leads to a complex root finding algorithm: "Die bei dem ersten Beweise gebrauchte 
Betrachtungsart giebt ein Mittel an die Hand, die Wurzeln der höheren Gleichungen mittels 
eines Apparates mechanisch zu finden." [The viewpoint used in the first proof provides a 
method to find the roots of higher-degree equations by means of a mechanical apparatus.] 

The state of the art in separating and approximating roots at the end of the 19th century 
has been surveyed in Runge's Encyklopädie article [36, Band I, §I-B3a]. 

In 1924 Weyl [65] reemphasized that the analytic index 
$\operatorname{ind}(F|\partial\Gamma) = \frac{1}{2\pi i} \int_{\partial\Gamma} \frac{F'(z)}{F(z)}\,dz$ can 
be used to find and approximate the roots of $F$. In this vein Weyl formulated his constructive 
proof of the Fundamental Theorem of Algebra, which indeed translates to an algorithm: 
a careful numerical approximation can be used to calculate the integer $\operatorname{ind}(F|\partial\Gamma)$, 
see Henrici [23, §6.11]. While Weyl's motivation may have been philosophical, it is the 
practical aspect that has proven most successful. Variants of Weyl's algorithm are used 
in modern computer implementations for finding approximate roots, and are among the 
asymptotically fastest known algorithms. The question of algorithmic complexity was 
pursued by Schönhage [49] and others since the 1980s. See Pan [41] for an overview. 
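
As a rough illustration of how the analytic index can be approximated numerically, the following Python sketch accumulates the change of argument of F along the boundary of a rectangle, traversed counterclockwise. It is not Weyl's algorithm itself, only a naive discretization of the index; the function names and the fixed number of sample points are assumptions made here, and the result is reliable only if F has no zero on the boundary and the sampling is fine enough that the argument changes by less than π between consecutive samples.

```python
import cmath

def poly_eval(coeffs, z):
    """Evaluate a polynomial (coefficients, highest degree first) by Horner's rule."""
    acc = 0j
    for c in coeffs:
        acc = acc * z + c
    return acc

def winding_number(coeffs, x0, x1, y0, y1, steps=2000):
    """Approximate ind(F | boundary of [x0,x1] x [y0,y1]) by summing argument increments.

    Assumes x0 < x1, y0 < y1, and that F does not vanish on the boundary.
    """
    corners = [complex(x0, y0), complex(x1, y0), complex(x1, y1), complex(x0, y1)]
    total = 0.0
    for k in range(4):
        start, end = corners[k], corners[(k + 1) % 4]
        prev = poly_eval(coeffs, start)
        for j in range(1, steps + 1):
            z = start + (end - start) * j / steps
            cur = poly_eval(coeffs, z)
            # phase of the quotient = argument increment, taken in (-pi, pi]
            total += cmath.phase(cur / prev)
            prev = cur
    return round(total / (2 * cmath.pi))
```

For instance, `winding_number([1, 0, 1], -1, 1, 0.5, 2)` examines Z^2 + 1 on a rectangle containing only the root i. Unlike the algebraic computation via Sturm chains, this floating-point count comes with no guarantee unless the step size is controlled by explicit bounds on F.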
The fact that Sturm's and Cauchy's theorems together can be applied to count complex 
roots seems not to be as widely known as it should be. In the 1969 Proceedings [12] on 
constructive aspects of the Fundamental Theorem of Algebra, Cauchy's algebraic method 
is not mentioned. Lehmer [31] uses a weaker form, the Routh-Hurwitz theorem, although 
Cauchy's general result would have been better suited. Cauchy's method reappears in 1978 
in a small note by Wilf [67], and is briefly mentioned in Schönhage's technical report [48]. 

Most often the computer algebra literature credits Weyl for the analytic-numeric algorithm, 
and Lehmer or Wilf for the algebraic-numeric method, but neither Cauchy nor Sturm. 
Even if Cauchy's index and Sturm's algorithm are widely used, their algebraic contributions 
to complex root location seem to be largely ignored. 

9. Conclusion 

The Fundamental Theorem of Algebra is one of the most classical results of mathemat- 
ics, and consequently a heavily tilled field. Many beautiful proofs are known. It should be 
emphasized, however, that a non-constructive existence proof only "announces the pres- 
ence of a treasure, without divulging its location", as Hermann Weyl put it. 

The constructive, real-algebraic proof presented here is based on ideas of Gauss, Sturm, 
and Cauchy. From a modern perspective it is a very natural approach, but to the best of 
my knowledge it is worked out here for the first time. The resulting proof is elementary, 
elegant, and effective, and thus has all desirable properties that one could wish for. 

Here the intuitive wording "elementary" can be given the precise meaning that the proof 
is formulated in the first-order language of ordered fields. Likewise "elegant" can be inter- 
preted as saying that the formalization carries meaning: it algebraically captures the geo- 
metrically intuitive notion of winding number. Finally, "effective" has the precise mean- 
ing of algorithmic feasibility: our formulation is completely explicit and directly imple- 
mentable. (The adjective "efficient", however, is reserved for minimizing the complexity.) 
While each of these virtues can be attained individually, their conjunction is remarkable. 
So if you asked me "Is this the ultimate proof?" I would say "Yes. And so will be each 
subsequent proof yet to be discovered." 